Abstract

We prove the following surprising result: given any quantum state $\rho$ on n qubits, there exists
a local Hamiltonian H on poly(n) qubits (e.g., a sum of two-qubit interactions), such that any
ground state of H can be used to simulate $\rho$ on all quantum circuits of fixed polynomial size.
In terms of complexity classes, this implies that BQP/qpoly $\subseteq$ QMA/poly, which supersedes the
previous result of Aaronson that BQP/qpoly $\subseteq$ PP/poly. Indeed, we can exactly characterize
quantum advice, as equivalent in power to untrusted quantum advice combined with trusted
classical advice.

Proving our main result requires combining a large number of previous tools (including a
result of Alon et al. on learning of real-valued concept classes, a result of Aaronson on the
learnability of quantum states, and a result of Aharonov and Regev on 'QMA+ super-verifiers') and
also creating some new ones. The main new tool is a so-called majority-certificates lemma,
which is closely related to boosting in machine learning, and which seems likely to find
independent applications. In its simplest version, this lemma says the following. Given any set
S of Boolean functions on n variables, any function $f \in S$ can be expressed as the pointwise
majority of $m = O(n)$ functions $f_1, \ldots, f_m \in S$, such that each $f_i$ is the unique function in S
compatible with $O(\log |S|)$ input/output constraints.

1 Introduction

How much classical information is needed to specify a quantum state of n qubits? 

This question has inspired a rich and varied set of responses, in part because it can be interpreted
in many ways. If we want to specify a quantum state $\rho$ exactly, then of course the answer is 'an
infinite amount,' since amplitudes in quantum mechanics are continuous. A natural compromise is
to try to specify $\rho$ approximately, i.e., to give a description which yields a state $\tilde{\rho}$ whose statistical
behavior is close to that of $\rho$ under every measurement. (This statement is captured by the
requirement that $\rho$ and $\tilde{\rho}$ are close under the so-called trace distance metric.) But it is not hard
to see that even for this task, we still need to use an exponential (in n) number of classical bits.

This fact can be viewed as a disappointment, but also as an opportunity, since it raises the
prospect that we might be able to encode massive amounts of information in physically compact
quantum states: for example, we might hope to store $2^n$ classical bits in n qubits. But an obvious
practical requirement is that we be able to retrieve the information reliably, and this rules out the
hope of significant 'quantum compression' of classical strings, as shown by a landmark result of
Holevo [20] from 1973. Consider a sender Alice and a recipient Bob, with a one-way quantum
channel between them. Then Holevo's Theorem says that, if Alice wants to encode an n-bit
classical string x into an m-qubit quantum state $\rho_x$, in such a way that Bob can retrieve x (with
probability 2/3, say) by measuring $\rho_x$, then Alice must take $m \ge n - O(1)$ (or $m \ge n/2 - O(1)$,
if Alice and Bob share entanglement). In other words, for this communication task, quantum
states offer essentially no advantage over classical strings. In 1999, Ambainis et al. [12] generalized
Holevo's result as follows: even if Bob wants to learn only a single bit $x_i$ of $x$ (for some
$i \in [n]$ unknown to Alice), and is willing to destroy the state $\rho_x$ in the process of learning that bit,
Alice still needs to send $m = \Omega(n)$ qubits for Bob to succeed with high probability.

These results say that the exponential descriptive complexity of quantum states cannot be 
effectively harnessed for classical data storage, but they do not bound the number of practically 
meaningful 'degrees of freedom' in a quantum state used for purposes other than storing data. For 
example, a quantum state could be useful for computation, or it could be a physical system worthy
of study in its own right. The question then becomes, what useful information can we give about
an n-qubit state using a 'reasonable' number (say, poly (n)) of classical bits? 

One approach to this question is to identify special subclasses of quantum states for which a
faithful approximation can be specified using only poly(n) bits. This has been done, for example,
with matrix product states [29] and 'tree states' [1]. A second approach is to try to describe an
arbitrary n-qubit state $\rho$ concisely, in such a way that the state $\tilde{\rho}$ recovered from the description is
close to $\rho$ with respect to some natural subclass of measurements. This has been done for specific
classes like the 'pretty good measurements' of Hausladen and Wootters [19]. A more ambitious goal
in this vein, explored by Aaronson in two previous works [2, 5] and continued in the present paper,
is to give a description of an n-qubit state $\rho$ which yields a state $\tilde{\rho}$ that behaves approximately
like $\rho$ with respect to all (binary) measurements performable by quantum circuits of 'reasonable'
size: say, of size at most $n^c$, for some fixed c > 0. Then if c is taken large enough, $\tilde{\rho}$ is arguably
'just as good' as $\rho$ for practical purposes.

Certainly we can achieve this goal using $2^{n^{O(c)}}$ bits: simply give approximations to the
measurement statistics for every size-$n^c$ circuit. However, the results of Holevo [20] and Ambainis et
al. [12] suggest that a much more succinct description might be possible. This hope was realized
by Aaronson [2], who gave a description scheme in which an n-qubit state can be specified using
poly(n) classical bits. There is a significant catch in Aaronson's result, though: the encoder Alice
and decoder Bob both need to invest exponential amounts of computation.

In a subsequent paper [5], Aaronson gave a closely-related result which significantly reduces the
computational requirements: now Alice can generate her message in polynomial time (for fixed c).
Also, while Bob cannot necessarily construct the state $\tilde{\rho}$ efficiently on his own, if he is presented
with such a state (by an untrusted prover, say), Bob can verify the state in polynomial time.
The catch in this result is a weakened approximation guarantee: Bob cannot use $\tilde{\rho}$ to predict the
outcomes of all the measurements defined by size-$n^c$ circuits, but only most of them (with respect
to a samplable distribution used by Alice in the encoding process). Aaronson [2, 5] conjectured
that the tradeoff between this result and the previous one revealed an inherent limit to quantum
compression.

1.1 Our Quantum Information Result 

The main result of this paper is that Aaronson's conjecture was false: one really can get the best
of both worlds, and simulate an arbitrary quantum state $\rho$ on all small circuits, using a different
state $\tilde{\rho}$ that is easy to recognize. Indeed, we can even take $\tilde{\rho}$ to be the ground state of a local
Hamiltonian: that is, the unique pure state $\tilde{\rho} = |\psi\rangle\langle\psi|$ on poly(n) qubits that is compatible with
poly(n) local constraints, each involving a constant number of qubits. In a sense, then, this paper
completes a 'trilogy' of which [2, 5] were the first two installments.

Here is a formal statement of our result.

Theorem 1 Let $c, \varepsilon > 0$, and let $\rho$ be any n-qubit quantum state. Then there exists a 2-local
Hamiltonian H on $\mathrm{poly}(n, 1/\varepsilon)$ qubits with unique ground state $|\psi\rangle\langle\psi|$, and a transformation $C \to C'$
of quantum circuits, computable in time $\mathrm{poly}(n, 1/\varepsilon)$ given H, such that the following holds:
$|C'(|\psi\rangle\langle\psi|) - C(\rho)| \le \varepsilon$ for any measurement C definable by a quantum circuit of size $n^c$. (Here
$C(\rho)$ is the probability that C accepts $\rho$.)

In other words, the ground states of local Hamiltonians are 'universal quantum states' in a
very non-obvious sense. For example, suppose you own a quantum software store, which sells
quantum states $\rho$ that can be fed as input to quantum computers. Then our result says that
ground states of local Hamiltonians are the only kind of state you ever need to stock. What makes
this surprising is that being a good piece of quantum software might entail satisfying an exponential
number of constraints: for example, if $\rho$ is supposed to help a customer's quantum computer Q
evaluate some Boolean function $f : \{0,1\}^n \to \{0,1\}$, then $Q(\rho, x)$ should output $f(x)$ for every
input $x \in \{0,1\}^n$. By contrast, any k-local Hamiltonian H can be described as a set of at most
$\binom{n}{k} = O(n^k)$ constraints.

One can also interpret Theorem 1 as a statement about communication over quantum channels.
Suppose Alice (who is computationally unbounded) has a classical description of an n-qubit state
$\rho$. She would like to describe $\rho$ to Bob (who is computationally bounded), at least well enough
for Bob to be able to simulate $\rho$ on all quantum circuits of some fixed polynomial size. However,
Alice cannot just send $\rho$ to Bob, since her quantum communication channel is noisy and there is a
chance that $\rho$ might get corrupted along the way. Nor can she send a faithful classical description
of $\rho$, since that would require an exponential number of bits. Our result provides an alternative:
Alice can send a different quantum state $\sigma$, of poly(n) qubits, together with a poly(n)-bit classical
string x. Then, Bob can use x to verify that $\sigma$ can be used to accurately simulate $\rho$ on all small
measurements.

We believe Theorem 1 makes a significant contribution to the study of the effective information
content of quantum states. It does, however, leave open whether a quantum state of n qubits can 
be efficiently encoded and decoded in polynomial time, in a way that is 'good enough' to preserve 
the measurement statistics of measurements defined by circuits of fixed polynomial size. This 
remains an important problem for future work. 

1.2 Impact on Quantum Complexity Theory 

The questions addressed in this paper, and our results, are naturally phrased and proved in terms 
of complexity classes. In recent years, researchers have defined quantum complexity classes as a 
way to study the 'useful information' embodied in quantum states. One approach is to study the 
power of nonuniform quantum advice. The class BQP/qpoly, defined by Nishimura and Yamakami 
[25], consists of all languages decidable in polynomial time by a quantum computer, with the help
of a poly (n)-qubit advice state that depends only on the input length n. This class is analogous 
to the classical class P/poly. To understand the role of quantum information in determining the 
power of BQP/qpoly, a useful benchmark of comparison is the class BQP/poly of decision problems 
efficiently solvable by a quantum computer with poly (n) bits of classical advice. It is open whether 
BQP/qpoly = BQP/poly. 

A second approach studies the power of quantum proof systems, by analogy with the classical 
class NP. Kitaev (unpublished, 1999) defined the complexity class now called QMA, for 'Quantum 
Merlin-Arthur'. This is the class of decision problems for which a 'yes' answer can be proved 
by exhibiting a quantum witness state (or quantum proof) $|\psi\rangle$, on poly(n) qubits, which is then
checked by a skeptical polynomial-time quantum verifier. A natural benchmark class is QCMA
(for 'Quantum Classical Merlin-Arthur'), defined by Aharonov and Naveh [8]. This is the class of
decision problems for which a 'yes' answer can be checked by a quantum verifier who receives a 
classical witness. Here the natural open question is whether QMA = QCMA. 

In this paper we prove a new upper bound on BQP/qpoly: 

Theorem 2 BQP/qpoly $\subseteq$ QMA/poly.

Previously Aaronson showed in [2] that BQP/qpoly $\subseteq$ PP/poly, and showed in [5] that BQP/qpoly
is contained in the 'heuristic' class HeurQMA/poly. Theorem 2 supersedes both of these earlier
results.

Theorem 2 says that one can always replace polynomial-size quantum advice by polynomial-size
classical advice, together with a polynomial-size quantum witness (or equivalently, untrusted
quantum advice). Indeed, we can characterize the class BQP/qpoly, as equal to the subclass of
QMA/poly in which the quantum witness state $|\psi_n\rangle$ can only depend on the input length n.^1

Using Theorem 2, we also obtain several other results for quantum complexity theory:

(1) Without loss of generality, every quantum advice state can be taken to be the ground state
of some local Hamiltonian H. (This essentially follows by combining our BQP/qpoly $\subseteq$
QMA/poly result with the result of Kitaev that Local Hamiltonians is QMA-complete.)

(2) It is open whether for every local Hamiltonian H on n qubits, there exists a quantum circuit of
size poly(n) that prepares a ground state of H. It is easy to show that an affirmative answer
would imply QMA = QCMA. As a consequence of Theorem 2, we can show that an affirmative
answer would also imply BQP/qpoly = BQP/poly, thereby establishing a previously-unknown
connection between quantum proofs and quantum advice.

(3) We generalize Theorem 2 to show that QCMA/qpoly $\subseteq$ QMA/poly.

(4) We also use our new characterization of BQP/qpoly to prove a quantum analogue of the
Karp-Lipton Theorem [23]. Recall that the Karp-Lipton Theorem says that if NP $\subseteq$ P/poly, then
the polynomial hierarchy collapses to the second level. Our 'Quantum Karp-Lipton Theorem'
says that if NP $\subseteq$ BQP/qpoly (that is, NP-complete problems are efficiently solvable with the
help of quantum advice), then $\Pi_2^P \subseteq \mathrm{QMA}^{\mathrm{PromiseQMA}}$. As far as we know, this is the first
nontrivial result to derive unlikely consequences from a hypothesis about quantum machines
being able to solve NP-complete problems in polynomial time.

Finally, using our result, we are able to explain a previously-mysterious aspect of a 2000 paper of
Watrous [31]. Watrous gave the best-known example of a problem in QMA that is not obviously in
QCMA: that is, for which quantum proofs actually seem to help.^2 This problem is called Group
Non-Membership, and is defined as follows: Arthur is given a finite black-box group G and a
subgroup $H \le G$ (specified by their generators), as well as an element $x \in G$. His task is to verify
that $x \notin H$. It is known that, as a black-box problem, this problem is not in MA. But Watrous
showed that Group Non-Membership is in QMA, since Merlin can always persuade Arthur that
$x \notin H$ by giving him the following quantum proof:

$$|H\rangle := \frac{1}{\sqrt{|H|}} \sum_{h \in H} |h\rangle.$$

Arthur's verification procedure consists of two tests. In the first test, Arthur assumes that Merlin
sent $|H\rangle$, and then uses $|H\rangle$ to decide whether $x \in H$. The test is a simple, beautiful illustration
of the power of quantum algorithms. The second test in Watrous's protocol confirms that Merlin
really sent $|H\rangle$, or at least, a state which is 'equivalent' for purposes of the first test. This second
test and its analysis are considerably more involved, and seem less 'natural'. So the question arises:
can the second test be omitted?

^1 We call this restricted class YQP/poly; in another notation it would be OQMA/poly $\cap$ coOQMA/poly (where the
O stands for 'oblivious').

^2 Aaronson and Kuperberg [6], however, give evidence that this problem might be in QCMA, under conjectures
related to the Classification of Finite Simple Groups.

Using our results, we find that a slightly weaker version of Watrous's second test can be derived
in an almost automatic way from his first test, as follows. If we assume that the black-box group
$H = H_n$ is fixed for each input length, then Group Non-Membership is in BQP/qpoly, by letting
$|H_n\rangle$ as above be the trusted advice for length n and using Watrous's first test as the BQP/qpoly
algorithm. Then Theorem 2 (which can be readily adapted to the black-box setting) tells us that
Group Non-Membership is in QMA/poly as well.

1.3 Proof Overview 

We now give an overview of the proof of Theorem 2, that BQP/qpoly $\subseteq$ QMA/poly. As we will
explain, our proof rests on a new idea we call the 'majority-certificates' technique, which is not
specifically quantum and which seems likely to find other applications.

We begin with a language $L \in$ BQP/qpoly and, for n > 0, a poly(n)-size quantum circuit
$Q(x, \xi)$ that computes L(x) with high probability when given the 'correct' advice state $\xi = \rho_n$ on
poly(n) qubits. The challenge, then, is to force Merlin to supply a witness state $\rho'$ that behaves
like $\rho_n$ on every input $x \in \{0,1\}^n$.

Every potential advice state $\xi$ defines a function $f_\xi : \{0,1\}^n \to [0,1]$, by $f_\xi(x) := \Pr[Q(x, \xi) = 1]$.
For each such $\xi$, let $\hat{f}_\xi(x) := [f_\xi(x) \ge 1/2]$ be the Boolean function obtained by rounding $f_\xi$.
As a simplification, suppose that Merlin is restricted to sending an advice state $\xi$ for which
$f_\xi(x) \notin (1/3, 2/3)$ for every x: that is, an advice state which renders a 'clear opinion' about every input
x. (This simplification helps to explain the main ideas, but does not follow the actual proof.) Let
S be the set of all Boolean functions $f : \{0,1\}^n \to \{0,1\}$ that are expressible as $\hat{f}_\xi$ for some such
advice state $\xi$. Then S includes the 'target function' $f^* := L_n$ (the restriction of L to inputs of
length n), as well as a potentially-large number of other functions. However, we claim S is not
too large: $|S| \le 2^{\mathrm{poly}(n)}$. This bound on the 'effective information content' of quantum states was
derived previously by Aaronson [2, 5], building on the work of Ambainis et al. [12].

One might initially hope that, just by virtue of the size bound on S, we could find some set of
poly(n) values

$$(x_1, f^*(x_1)), \ldots, (x_k, f^*(x_k))$$

which isolate $f^*$ in S: that is, which differentiate $f^*$ from all other members of S. In that case,
the trusted classical advice could simply specify those values, as 'tests' for Arthur to perform on
the quantum state sent by Merlin. Alas, this hope is unfounded in general. For consider the case
where $f^*$ is the identically-zero function, and S consists of $f^*$ along with the 'point function' $f_y$
(which equals 1 on y and 0 elsewhere), for all $y \in \{0,1\}^n$. Then $f^*$ can only be isolated in S by
specifying its value at every point!

Luckily, this counterexample leads us to a key observation. Although $f^*$ is not isolatable in
S by a small number of values, each point function $f_y$ can be isolated (by its value at y), and
moreover, $f_y$ is quite 'close' to $f^*$. In fact, if we choose any three distinct strings x, y, z, then
$f^* = \mathrm{MAJ}(f_x, f_y, f_z)$. (Of course if $f^*$ were the identically-zero function, it could be easily
specified with classical advice! But $f^*$ could have been any function in this example.)
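
The observation can be checked mechanically for small n. The following Python sketch builds three point functions on n = 3 bits and confirms that their pointwise majority is the identically-zero function. The helper names point_function and majority, and the particular choice of x, y, z, are ours and serve only as an illustration.

```python
from itertools import product

def point_function(y):
    """Return f_y, which equals 1 on input y and 0 elsewhere."""
    return lambda x: 1 if x == y else 0

def majority(bits):
    """Strict majority of a list of 0/1 values (odd length assumed)."""
    return 1 if 2 * sum(bits) > len(bits) else 0

n = 3
inputs = list(product([0, 1], repeat=n))
x, y, z = inputs[1], inputs[4], inputs[6]   # any three distinct strings
fs = [point_function(s) for s in (x, y, z)]

# On any input, at most one of the three point functions outputs 1,
# so the majority vote is 0 everywhere.
maj_values = [majority([f(w) for f in fs]) for w in inputs]
```

Each of the three functions is pinned down in S by a single value, while the zero function itself needs all $2^n$ values, which is the gap the majority-certificates idea exploits.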

This suggests a new, more indirect approach to our general problem: we try to express $f^*$ as the
pointwise majority vote

$$f^*(x) = \mathrm{MAJ}(f_1(x), \ldots, f_m(x)),$$

of a small number (m = O(n), say) of other functions $f_1, \ldots, f_m$ in S, where each $f_i$ is isolatable
in S by specifying at most $k = O(\log |S|)$ of its values. Indeed, we will show this can always be
done. We call this key result the majority-certificates lemma; we will say more about its proof and
its relation to earlier work in Section 1.4.

With this lemma in hand, we can solve our (artificially simplified) problem: in the QMA/poly
protocol for L, we use certificates which isolate $f_1, \ldots, f_m \in S$ as above as the classical advice
for Arthur. Arthur requests from Merlin each of the m states $\xi_1, \ldots, \xi_m$ such that $f_i = \hat{f}_{\xi_i}$, and
verifies that he receives appropriate states by checking them against the certificates. This involves
multiple measurements of each $\xi_i$, and an immediate difficulty is that, since measurements are
irreversible in quantum mechanics, the process of verifying the witness state might also destroy it.
However, we get around this difficulty by appealing to a result of Aharonov and Regev [10]. This
result essentially says that a QMA protocol in which Arthur is granted the (physically unrealistic)
ability to perform 'non-destructive measurements' on his witness state, can be efficiently simulated
by an ordinary QMA protocol.

To build intuition, we will begin (in Section 2) by proving the majority-certificates lemma
for Boolean functions, as described above. However, to remove the artificial simplification we
made and prove Theorem 2, we will need to generalize the lemma substantially, to a statement
about possibly-infinite sets of real-valued functions $f : \{0,1\}^n \to [0,1]$. In the general version, the
hypothesis that S is finite and not too large gets replaced by a more subtle assumption: namely,
an upper bound on the so-called fat-shattering dimension of S. To prove our generalization, we
use powerful results of Alon et al. [11] and Bartlett and Long [13] on the learnability of real-valued
functions. We then use a bound on the fat-shattering dimension of real-valued functions defined
by quantum states (from Aaronson [5], building on Ambainis et al. [12]). Figure 1 shows the overall
dependency structure of the proof.

1.4 Majority-Certificates Lemma in Context 

The majority-certificates lemma is closely related to the seminal notion of boosting [27] from
computational learning theory. Boosting is a broad topic with a vast literature, but a common 'generic'
form of the boosting problem is as follows: we want to learn some target function $f^*$, given
sample data of the form $(x, f^*(x))$. We assume we have a weak learning algorithm A with the
property that, for any probability distribution $\mathcal{D}$ over inputs x, with high probability A finds a
hypothesis $f \in \mathcal{F}$ which predicts $f^*(x)$ 'reasonably well' when $x \sim \mathcal{D}$. The task is to 'boost' this
weak learner into a strong learner B. The strong learner should output a collection of functions
$f_1, \ldots, f_m \in \mathcal{F}$ such that a (possibly-weighted) majority vote over $f_1(x), \ldots, f_m(x)$ predicts $f^*(x)$
'extremely well.' It turns out [27, 18] that this goal can be achieved in a very general setting.

Our majority-certificates lemma has strengths and weaknesses compared to boosting. Our
assumptions are much milder than those of boosting: rather than needing a weak learner, we
assume only that the hypothesis class S is 'not too large.' Also, we represent our target function
$f^*$ exactly by $\mathrm{MAJ}(f_1, \ldots, f_m)$, not just approximately. On the other hand, we do not give an
efficient algorithm to find our majority-representation. Also, the $f_i$'s are not 'explicitly given': we
only give a way to recognize each $f_i$, under the assumption that the function purporting to be $f_i$ is
in fact drawn from the original hypothesis class.

Figure 1: Dependency structure of our proof that quantum advice states can be expressed as ground
states of local Hamiltonians.



The proof of our lemma also has similarities to boosting. As an analogue of a 'weak learner',
we show that for every distribution $\mathcal{D}$, there exists a function $f \in S$ which agrees with the
target function $f^*$ on most $x \sim \mathcal{D}$, and which is isolatable in S by specifying $O(\log |S|)$ queries.
Using the Minimax Theorem, we then nonconstructively 'boost' this fact into the desired
majority-representation of $f^*$. We note that Nisan used the Minimax Theorem for boosting in a similar
way, in his alternative proof of Impagliazzo's 'hard-core set theorem' (see [21]).

The majority-certificates lemma is also reminiscent of Bshouty et al.'s algorithm [15], for learning
small circuits in the complexity class $\mathrm{ZPP}^{\mathrm{NP}}$. Our lemma lacks the algorithmic component of this
earlier work, but unlike Bshouty et al., we do not require the functions being learned to come with
any succinct labels (such as circuit descriptions).

1.5 Organization of the Paper 

In Section 2, we prove the Boolean majority-certificates lemma. In Section 3, we give our
real-valued generalization of this lemma, and in Section 4, we use it to prove Theorem 2 and state
some consequences for quantum complexity theory. Theorem 1 is proved in Section 4.3. Section 5
contains some further results for quantum complexity, and the Appendices provide some additional
applications of and perspectives on the majority-certificates lemma.

2 The Majority-Certificates Lemma 

A Boolean concept class is a family of sets $\{S_n\}_{n \ge 1}$, where each $S_n$ consists of Boolean functions
$f : \{0,1\}^n \to \{0,1\}$ on n variables. Abusing notation, we will often use S to refer directly to a set
of Boolean functions on n variables, with the quantification over n being understood.

By a certificate, we mean a partial Boolean function $C : \{0,1\}^n \to \{0,1,*\}$. The size of C,
denoted |C|, is the number of inputs x such that $C(x) \in \{0,1\}$. A Boolean function $f : \{0,1\}^n \to \{0,1\}$
is consistent with C if f(x) = C(x) whenever $C(x) \in \{0,1\}$. Given a set S of Boolean
functions and a certificate C, let S[C] be the set of all functions $f \in S$ that are consistent with C.
Say that a function $f \in S$ is isolated in S by the certificate C if $S[C] = \{f\}$.
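
These definitions translate directly into code. The sketch below is only an illustration under our own representation choices: a function is a dict from inputs to bits, a certificate is a dict holding just the inputs where it is defined (so absent inputs play the role of *), and the names consistent, restrict and isolates are hypothetical helpers. It revisits the point-function class from Section 1.3 with n = 2.

```python
from itertools import combinations, product

def consistent(f, cert):
    """f agrees with the certificate wherever the certificate is defined."""
    return all(f[x] == b for x, b in cert.items())

def restrict(S, cert):
    """S[C]: the functions in S that are consistent with cert."""
    return [f for f in S if consistent(f, cert)]

def isolates(cert, f, S):
    """True when S[C] = {f} (S is assumed to list distinct functions)."""
    return restrict(S, cert) == [f]

n = 2
inputs = list(product([0, 1], repeat=n))
zero = {x: 0 for x in inputs}
points = [{x: int(x == y) for x in inputs} for y in inputs]
S = [zero] + points

# A point function is isolated by the single constraint C(y) = 1.
y = inputs[2]
point_isolated = isolates({y: 1}, points[2], S)

# Smallest certificate isolating the zero function. Only all-zero
# certificates are consistent with it, so those are the ones we try.
zero_cert_size = min(
    k
    for k in range(len(inputs) + 1)
    for dom in combinations(inputs, k)
    if isolates({x: 0 for x in dom}, zero, S)
)
```

Here a size-1 certificate isolates each point function, whereas the zero function needs a certificate of size $2^n = 4$.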

We now prove a lemma that represents one of the main tools of this paper. 

Lemma 3 (Majority-Certificates Lemma) Let S be a set of Boolean functions $f : \{0,1\}^n \to \{0,1\}$,
and let $f^* \in S$. Then there exist m = O(n) certificates $C_1, \ldots, C_m$, each of size
$k = O(\log |S|)$, and functions $f_1, \ldots, f_m \in S$, such that

(i) $S[C_i] = \{f_i\}$ for all $i \in [m]$;

(ii) $\mathrm{MAJ}(f_1(x), \ldots, f_m(x)) = f^*(x)$ for all $x \in \{0,1\}^n$.

Proof. Our proof of Lemma 3 relies on the following claim.

Claim 4 Let $\mathcal{D}$ be any distribution over inputs $x \in \{0,1\}^n$. Then there exists a function $f \in S$
such that

(i) f is isolatable in S by a certificate C of size $k = O(\log |S|)$;

(ii) $\Pr_{x \sim \mathcal{D}}[f(x) \ne f^*(x)] \le \frac{1}{10}$.

Lemma 3 follows from Claim 4 by a boosting-type argument, as follows. Consider a two-player
game where:

• Alice chooses a certificate C of size k that isolates some $f \in S$, and

• Bob simultaneously chooses an input $x \in \{0,1\}^n$.

Alice wins the game if $f(x) = f^*(x)$. Claim 4 tells us that for every mixed strategy of Bob (i.e.,
distribution $\mathcal{D}$ over inputs), there exists a pure strategy of Alice that succeeds with probability at
least 0.9 against $\mathcal{D}$. Then by the Minimax Theorem, there exists a mixed strategy for Alice (that
is, a probability distribution $\mathcal{C}$ over certificates) that allows her to win with probability at least
0.9 against every pure strategy of Bob.
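
For the point-function example of Section 1.3, a mixed strategy of this kind can be written down explicitly. The sketch below assumes n = 4, takes $f^*$ to be the zero function, and lets Alice pick the certificate {y: 1} for a uniformly random y; the helper alice_wins is our own illustrative name.

```python
from fractions import Fraction
from itertools import product

n = 4
inputs = list(product([0, 1], repeat=n))

def alice_wins(y, x):
    """Alice plays the certificate {y: 1}, which isolates the point function f_y.
    She wins on Bob's input x when f_y(x) equals f*(x) = 0."""
    f_y_at_x = 1 if x == y else 0
    return f_y_at_x == 0

# Alice's mixed strategy: choose y uniformly at random.
win_prob = {
    x: Fraction(sum(alice_wins(y, x) for y in inputs), len(inputs))
    for x in inputs
}
worst_case = min(win_prob.values())
```

Against every pure strategy x of Bob, Alice loses only when y = x, so her winning probability is $1 - 2^{-n} = 15/16$, which exceeds 0.9.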

Now suppose we draw $C_1, \ldots, C_m$ independently from $\mathcal{C}$, isolating functions $f_1, \ldots, f_m$ in S.
Fix an input $x \in \{0,1\}^n$; then by the success of Alice's strategy against x, and applying a Chernoff
bound,

$$\Pr_{C_1, \ldots, C_m \sim \mathcal{C}}\left[\mathrm{MAJ}(f_1(x), \ldots, f_m(x)) \ne f^*(x)\right] < \frac{1}{2^n},$$

provided we choose m = O(n) suitably. But by the union bound, this means there must be a
fixed choice of $C_1, \ldots, C_m$ such that $\mathrm{MAJ}(f_1(x), \ldots, f_m(x)) = f^*(x)$ for all $x \in \{0,1\}^n$, where each $f_i$ is isolated in S by
$C_i$. This proves Lemma 3, modulo the Claim. ■
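
To make 'suitably' concrete, here is one way the calculation can go. It uses Hoeffding's inequality as the concentration bound, which is our choice for illustration and not necessarily the form of the Chernoff bound intended above; the indicator variables $Z_i$ are also our notation.

```latex
% Z_i = 1 if f_i(x) = f^*(x) and Z_i = 0 otherwise.
% The Z_i are i.i.d. with E[Z_i] >= 0.9, and a wrong majority needs
% at most half of them to equal 1, a deviation of at least 0.4 below the mean.
\[
\Pr\Big[\mathrm{MAJ}(f_1(x),\ldots,f_m(x)) \ne f^*(x)\Big]
\le \Pr\Big[\tfrac{1}{m}\sum_{i=1}^m Z_i \le \tfrac12\Big]
\le \exp\!\big(-2m(0.4)^2\big) = e^{-0.32\,m}.
\]
% This is below 2^{-n} = e^{-n \ln 2} once m > (\ln 2 / 0.32)\, n \approx 2.17\, n,
% so for instance m = 3n suffices. A union bound over the 2^n inputs then gives
\[
\Pr\Big[\exists x:\ \mathrm{MAJ}(f_1(x),\ldots,f_m(x)) \ne f^*(x)\Big] < 2^n \cdot 2^{-n} = 1 .
\]
```

Since the failure probability is strictly below 1, some fixed choice of certificates works for every input simultaneously.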

Proof of Claim 4. By symmetry, we can assume without loss of generality that $f^*$ is the
identically-zero function. Given the mixed strategy $\mathcal{D}$ of Bob, we construct the certificate C
as follows. Initially C is empty: that is, C(x) = * for all $x \in \{0,1\}^n$. In the first stage, we draw
$t = O(\log |S|)$ inputs $x_1, \ldots, x_t$ independently from $\mathcal{D}$. For any $f : \{0,1\}^n \to \{0,1\}$, let

$$w_f := \Pr_{x \sim \mathcal{D}}[f(x) = 1].$$

Now suppose f is such that $w_f > 0.1$. Then

$$\Pr_{x_1, \ldots, x_t \sim \mathcal{D}}[f(x_1) = 0 \wedge \cdots \wedge f(x_t) = 0] < 0.9^t \le \frac{1}{|S|},$$

provided $t \ge \log_{10/9} |S|$. So by the union bound, there must be a fixed choice of $x_1, \ldots, x_t$ that kills
off every $f \in S$ such that $w_f > 0.1$: that is, such that $f(x_1) = \cdots = f(x_t) = 0$ implies $w_f \le 0.1$.
Fix that $x_1, \ldots, x_t$, and set $C(x_i) := 0$ for all $i \in [t]$.

In the second stage, our goal is just to isolate some particular function $f \in S[C]$. We do this
recursively as follows. If |S[C]| = 1 then we are done. Otherwise, there exists an input x such
that f(x) = 0 for some $f \in S[C]$ and f(x) = 1 for other $f \in S[C]$. If setting C(x) := 0 decreases
|S[C]| by at least a factor of 2, then set C(x) := 0; otherwise set C(x) := 1. Since S[C] can halve
in size at most $\log_2 |S|$ times, this procedure terminates after at most $\log_2 |S|$ steps with |S[C]| = 1.

The end result is a certificate C of size $O(\log |S|)$, which isolates a function f in S for which
$w_f \le 1/10$. We have therefore found a pure strategy for Alice that fails with probability at most
1/10 against $\mathcal{D}$, as desired. ■
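
The two-stage construction can be sketched in Python for a small explicit class. Several things here are our assumptions, not part of the proof: functions are stored as truth tables (tuples of bits), $f^*$ is the zero table, and the first stage simply resamples until it finds a sample of the kind the union bound promises. The names build_certificate, restrict, weight and table are hypothetical.

```python
import math
import random
from itertools import combinations

def restrict(S, cert):
    """S[C]: the truth tables in S that agree with cert on every defined input."""
    return [f for f in S if all(f[i] == b for i, b in cert.items())]

def weight(f, probs):
    """w_f = Pr_{x ~ D}[f(x) = 1], where probs[i] is the probability of input i."""
    return sum(p for p, v in zip(probs, f) if v == 1)

def build_certificate(S, probs, rng, max_tries=1000):
    """Return (cert, f) with cert isolating f in S and w_f <= 0.1.

    S must list distinct truth tables and contain the all-zero table (f*).
    """
    num_inputs = len(S[0])
    # Smallest integer t with t > log_{10/9} |S|, so that 0.9**t < 1/|S|.
    t = math.floor(math.log(len(S)) / math.log(10 / 9)) + 1

    # Stage 1: resample until the zero-constraints kill every f with w_f > 0.1.
    # The proof only shows such a sample exists; retrying is our assumption.
    for _ in range(max_tries):
        sample = rng.choices(range(num_inputs), weights=probs, k=t)
        cert = {i: 0 for i in sample}
        live = restrict(S, cert)
        if all(weight(f, probs) <= 0.1 for f in live):
            break
    else:
        raise RuntimeError("no suitable sample found within max_tries")

    # Stage 2: pick an input where the survivors disagree and keep the
    # side that is at most half, so |S[C]| at least halves at every step.
    while len(live) > 1:
        i = next(j for j in range(num_inputs) if len({f[j] for f in live}) == 2)
        zeros = sum(1 for f in live if f[i] == 0)
        cert[i] = 0 if 2 * zeros <= len(live) else 1
        live = [f for f in live if f[i] == cert[i]]
    return cert, live[0]

# Example class on 16 inputs: the zero table plus all tables with one or two 1s.
N = 16

def table(ones):
    return tuple(1 if i in ones else 0 for i in range(N))

S = [table(())] + [table(c) for k in (1, 2) for c in combinations(range(N), k)]
probs = [1 / N] * N
cert, f = build_certificate(S, probs, random.Random(0))
```

With the uniform distribution on 16 inputs, every table with two 1s has weight 0.125 and must be eliminated in the first stage, while tables with a single 1 have weight 0.0625 and may survive to be winnowed in the second.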

3 Extension to Real Functions 

In this section, we extend the majority-certificates lemma from Boolean functions to real-valued
functions $f : \{0,1\}^n \to [0,1]$. We will need this extension for the application to quantum advice
in Section 4. In proving our extension we will have to confront several new difficulties. Firstly,
the concept classes S that we want to consider can now contain a continuum of functions, so
Lemma 3, which assumed that S was finite and constructed certificates of size $O(\log |S|)$, is not
going to work. In Section 3.1, we review notions from computational learning theory, including
fat-shattering dimension and $\varepsilon$-covers, which (combined with results of Alon et al. [11] and Bartlett
and Long [13]) can be used to get around this difficulty. Secondly, it is no longer enough to isolate
a function $f_i \in S$ that we are interested in; instead we will need to 'safely' isolate $f_i$, which roughly
speaking means that (i) $f_i$ is consistent with some certificate C, and (ii) any $f \in S$ that is even
approximately consistent with C is close to $f_i$. In Section 3.2, we prove a 'safe winnowing lemma'
that can be used for this purpose. Finally, in Section 3.3, we put the pieces together to prove a
real-valued majority-certificates lemma.

3.1 Background from Learning Theory 

A p-concept class S is a family of functions $f : \{0,1\}^n \to [0,1]$ (as usual, quantification over all n
is understood). Given functions $f, g : \{0,1\}^n \to [0,1]$ and a subset of inputs $X \subseteq \{0,1\}^n$, we will
be interested in three measures of the distance between f and g restricted to X:

$$\Delta_\infty(f, g)[X] := \max_{x \in X} |f(x) - g(x)|,$$

$$\Delta_2(f, g)[X] := \sqrt{\sum_{x \in X} (f(x) - g(x))^2},$$

$$\Delta_1(f, g)[X] := \sum_{x \in X} |f(x) - g(x)|.$$

For convenience, we define $\Delta_\infty(f, g) := \Delta_\infty(f, g)[\{0,1\}^n]$, and similarly for $\Delta_2(f, g)$ and $\Delta_1(f, g)$.
Also, given a distribution $\mathcal{D}$ over $\{0,1\}^n$, define

$$\Delta_1(f, g)\langle\mathcal{D}\rangle := \mathop{\mathbb{E}}_{x \sim \mathcal{D}}\left[|f(x) - g(x)|\right].$$
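
As a quick numerical illustration, the distance measures can be computed as follows. This is a minimal sketch under our own conventions: functions are Python callables, X is a list of inputs, a distribution is a dict from inputs to probabilities, and $\Delta_2$ is taken to be the unnormalized Euclidean distance over X, which is an assumption on our part. The function names are hypothetical.

```python
import math

def delta_inf(f, g, X):
    """Largest pointwise gap between f and g on X."""
    return max(abs(f(x) - g(x)) for x in X)

def delta_2(f, g, X):
    """Euclidean distance between f and g restricted to X (assumed form)."""
    return math.sqrt(sum((f(x) - g(x)) ** 2 for x in X))

def delta_1(f, g, X):
    """Sum of pointwise gaps between f and g on X."""
    return sum(abs(f(x) - g(x)) for x in X)

def delta_1_dist(f, g, D):
    """Expected gap |f(x) - g(x)| for x drawn from D (dict: input -> probability)."""
    return sum(p * abs(f(x) - g(x)) for x, p in D.items())

# Toy example on four inputs labelled 0..3.
X = [0, 1, 2, 3]
f = lambda x: x / 4
g = lambda x: 0.0
uniform = {x: 0.25 for x in X}

d_inf = delta_inf(f, g, X)             # gaps are 0, 0.25, 0.5, 0.75
d_two = delta_2(f, g, X)
d_one = delta_1(f, g, X)
d_exp = delta_1_dist(f, g, uniform)
```

On any fixed X these quantities satisfy $\Delta_\infty \le \Delta_2 \le \Delta_1$, and under the uniform distribution the expected gap is $\Delta_1$ divided by |X|.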

Finally we will need the notions of coverability and fat-shattering dimension. 

Definition 5 (Coverability) Let S be a p-concept class. The subset $C \subseteq S$ is an $\varepsilon$-cover for S
if for all $f \in S$, there exists a $g \in C$ such that $\Delta_\infty(f, g) \le \varepsilon$. We say S is coverable if for all
$\varepsilon > 0$, there exists an $\varepsilon$-cover for S of size $2^{\mathrm{poly}(n, 1/\varepsilon)}$.

Definition 6 (Fat-Shattering Dimension) Let S be a p-concept class and $\varepsilon > 0$ be a real
number. We say the set $A \subseteq \{0,1\}^n$ is $\varepsilon$-shattered by S if there exists a function $r : A \to [0,1]$ such
that for all $2^{|A|}$ Boolean functions $g : A \to \{0,1\}$, there exists a p-concept $f \in S$ such that for all
$x \in A$, we have $f(x) \le r(x) - \varepsilon$ whenever g(x) = 0 and $f(x) \ge r(x) + \varepsilon$ whenever g(x) = 1.
Then the $\varepsilon$-fat-shattering dimension of S, or $\mathrm{fat}_\varepsilon(S)$, is the size of the largest set $\varepsilon$-shattered by S.
We say S is bounded-dimensional if $\mathrm{fat}_\varepsilon(S) \le \mathrm{poly}(n, 1/\varepsilon)$ for all $\varepsilon > 0$.
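
For a small finite class, the definition can be checked by brute force. The sketch below is illustrative only: each p-concept is a dict from inputs to values, and the search over the witness r is restricted to midpoints of pairs of values taken at each input, which suffices for finite classes (if some r works, the midpoint between the largest 'low' value and the smallest 'high' value works too). The names is_shattered and fat_dimension are ours.

```python
from itertools import combinations, product

def is_shattered(A, S, eps):
    """Check whether the tuple of inputs A is eps-shattered by the finite class S."""
    # Candidate witness values r(x): midpoints of pairs of values taken at x.
    candidates = [sorted({(f[x] + h[x]) / 2 for f in S for h in S}) for x in A]
    for r in product(*candidates):
        realized_all = all(
            any(
                all(
                    (f[x] <= rx - eps) if bit == 0 else (f[x] >= rx + eps)
                    for x, rx, bit in zip(A, r, g)
                )
                for f in S
            )
            for g in product([0, 1], repeat=len(A))
        )
        if realized_all:
            return True
    return False

def fat_dimension(domain, S, eps):
    """Size of the largest subset of domain that is eps-shattered by S."""
    return max(
        len(A)
        for k in range(len(domain) + 1)
        for A in combinations(domain, k)
        if is_shattered(A, S, eps)
    )

# Toy class on two inputs: every combination of the values 0.25 and 0.75.
domain = [0, 1]
S = [{0: a, 1: b} for a in (0.25, 0.75) for b in (0.25, 0.75)]
dim_quarter = fat_dimension(domain, S, 0.25)
dim_larger = fat_dimension(domain, S, 0.375)
```

In this toy class the values at each input are separated by a gap of 0.5, so both inputs are shattered at $\varepsilon = 0.25$ (with r identically 0.5), while at $\varepsilon = 0.375$ not even a single input is shattered.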

The following central result was shown by Alon et al. [11] (see also [22]).

Theorem 7 ([11]) Every p-concept class S has an $\varepsilon$-cover of size $\exp\left[O\left((n + \log 1/\varepsilon)\, \mathrm{fat}_{\varepsilon/4}(S)\right)\right]$.
So in particular, if S is bounded-dimensional then S is coverable.

Building on the work of Alon et al. [11], Bartlett and Long [13] then proved the following: 

Theorem 8 ([13]) Let S be a p-concept class and $\mathcal{D}$ be a distribution over $\{0,1\}^n$. Fix an
$f : \{0,1\}^n \to [0,1]$ (not necessarily in S) and an error parameter $\alpha > 0$. Suppose we form a set
$X \subseteq \{0,1\}^n$ by choosing m inputs independently with replacement from $\mathcal{D}$. Then there exists a
positive constant K such that, with probability at least $1 - \delta$ over X, any hypothesis $h \in S$ that
minimizes $\Delta_1(h, f)[X]$ also satisfies

$$\Delta_1(h, f)\langle\mathcal{D}\rangle \le \alpha + \inf_{g \in S} \Delta_1(g, f)\langle\mathcal{D}\rangle,$$

provided

$$m \ge \frac{K}{\alpha^2}\left(\mathrm{fat}_{\alpha/5}(S) \log^2 \frac{1}{\alpha} + \log \frac{1}{\delta}\right).$$

Theorem 8 has the following corollary, which is similar to Corollary 2.4 of Aaronson [5], but
more directly suited to our purposes here.

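Corollary 9 Let $S$ be a p-concept class and $\mathcal{D}$ be a distribution over $\{0,1\}^n$. Fix an $f \in S$ and an
error parameter $\varepsilon > 0$. Suppose we form a set $X \subseteq \{0,1\}^n$ by choosing $m$ inputs independently with
replacement from $\mathcal{D}$. Then there exists a positive constant $K$ such that, with probability at least
$1 - \delta$ over $X$, any hypothesis $h \in S$ that satisfies $\Delta_\infty(h,f)[X] \le \varepsilon$ also satisfies $\Delta_1(h,f)\langle\mathcal{D}\rangle \le 11\varepsilon$,
provided

$$m \ge \frac{K}{\varepsilon^2}\left(\mathrm{fat}_{\varepsilon}(S) \log^2\frac{1}{\varepsilon} + \log\frac{1}{\delta}\right).$$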

Proof. Let $S^*$ be the p-concept class consisting of all functions $g : \{0,1\}^n \to [0,1]$ for which
there exists an $f \in S$ such that $\Delta_\infty(g,f) \le \varepsilon$. Fix an $f \in S$ and a distribution $\mathcal{D}$, and let $X$
be chosen as in the statement of the corollary. Suppose we choose a hypothesis $h \in S$ such that
$\Delta_\infty(h,f)[X] \le \varepsilon$. Then there exists a function $g \in S^*$ such that $g(x) = h(x)$ for all $x \in X$.
This $g$ is simply obtained by setting $g(x) := h(x)$ if $x \in X$ and $g(x) := f(x)$ otherwise. In
particular, note that $\Delta_1(h,g)[X] = 0$, which means that $h$ minimizes the functional $\Delta_1(h,g)[X]$
over all hypotheses in $S$ (and indeed in $S^*$). By Theorem 8, this implies that with probability at
least $1 - \delta$ over $X$,

$$\Delta_1(h,g)\langle\mathcal{D}\rangle \le \alpha + \inf_{u \in S^*} \Delta_1(u,g)\langle\mathcal{D}\rangle = \alpha$$

for all $\alpha > 0$, provided we take

$$m \ge \frac{K}{\alpha^2}\left(\mathrm{fat}_{\alpha/5}(S^*) \log^2\frac{1}{\alpha} + \log\frac{1}{\delta}\right).$$

Here we have used the fact that $g \in S^*$, and hence

$$\inf_{u \in S^*} \Delta_1(u,g)\langle\mathcal{D}\rangle = 0.$$

So by the triangle inequality,

$$\Delta_1(h,f)\langle\mathcal{D}\rangle \le \Delta_1(h,g)\langle\mathcal{D}\rangle + \Delta_1(g,f)\langle\mathcal{D}\rangle \le \alpha + \Delta_\infty(g,f) \le \alpha + \varepsilon.$$

Next, we claim that $\mathrm{fat}_{\alpha/5}(S^*) \le \mathrm{fat}_{\alpha/5 - \varepsilon}(S)$. The reason is simply that, if a given set is $\beta$-fat-shattered
by $S^*$, then it must also be $(\beta - \varepsilon)$-fat-shattered by $S$, by the triangle inequality. Setting $\alpha := 10\varepsilon$
now yields the desired statement. ■

3.2 The Safe Winnowing Lemma 

To prove the real-valued majority-certificates lemma, the first step is to prove a so-called 'safe
winnowing lemma.' This lemma says intuitively that, given any set $S$ of real-valued functions
with a small $\varepsilon$-cover (or equivalently, with polynomially-bounded fat-shattering dimension), it is
possible to find a set of $k = \mathrm{poly}(n)$ constraints $|f(x_1) - a_1| \le \varepsilon, \ldots, |f(x_k) - a_k| \le \varepsilon$ that are
essentially compatible with one and only one function $f \in S$. Here 'essentially' means that (i) any
function that satisfies the constraints is close to $f$ in $L_\infty$-norm, and (ii) $f$ itself not only satisfies
the constraints, but does so with a 'margin to spare.'

Footnote (to the remark preceding Corollary 9): It would also be possible to apply the bound from [5] 'off-the-shelf,' but at the cost of a worse dependence on
$1/\varepsilon$.

Lemma 10 (Safe Winnowing Lemma) Let $S$ be a set of functions $f : \{0,1\}^n \to [0,1]$. Fix a
function $f^* \in S$ and subset $Y \subseteq \{0,1\}^n$. For some parameter $\varepsilon > 0$, let $C$ be a finite $\varepsilon$-cover for
$S$. Then there exists an $f \in S$, as well as a subset $Z \subseteq \{0,1\}^n$ of size at most $k = \log_2 |C|$, such
that:

(i) Every $g \in S$ that satisfies $\Delta_\infty(f,g)[Y \cup Z] \le \frac{\varepsilon}{5k}$ also satisfies $\Delta_\infty(f,g) \le 3\varepsilon$.

(ii) $\Delta_\infty(f,f^*)[Y] \le \varepsilon/5$.

Proof. Let $\delta := \frac{\varepsilon}{5k}$. We construct $(f, Z)$ by an iterative procedure. Initially let $S_0 := S$, let
$f_0 := f^*$, and let $Z_0 := Y$. We will form new sets $S_1, S_2, \ldots$ by repeatedly adding constraints of the
form $f(x) < a$ or $f(x) > a$ for various $x, a$, maintaining the invariant that $f_t \in S_t$. At iteration
$t$, suppose there exists a function $g \in S_{t-1}$ such that $\Delta_\infty(f_{t-1},g)[Y \cup Z_{t-1}] \le \delta$, but nevertheless
$|f_{t-1}(z_t) - g(z_t)| > 3\varepsilon$ for some input $z_t$. Then first set $Z_t := Z_{t-1} \cup \{z_t\}$ (i.e., add $z_t$ into our set
of inputs, if it is not already there). Let $v := \frac{1}{2}\left[f_{t-1}(z_t) + g(z_t)\right]$, let $A$ be the set of all functions
$h \in S_{t-1}$ such that $h(z_t) < v$, and let $B$ be the set of all $h \in S_{t-1}$ such that $h(z_t) > v$. Also, for
any given set $M$, let $M^C := M \cap C$. Then clearly $\min\{|A^C|, |B^C|\} \le |S^C_{t-1}|/2$. If $|A^C| < |B^C|$
then set $S_t := A$; otherwise set $S_t := B$. Then set $f_t := f_{t-1}$ if $f_{t-1} \in S_t$ and $f_t := g$ otherwise.
Since $|S^C_t|$ can halve at most $k = \log_2 |C|$ times, it is clear that after $T \le k$ iterations we have
$|S^C_T| \le 1$. Set $f := f_T$ and $Z := Z_T$. Then by the triangle inequality,

$$\Delta_\infty(f,f^*)[Y] \le T\delta \le \frac{\varepsilon}{5},$$

and also

$$|f(z_t) - f_t(z_t)| \le (T - t)\delta \le \frac{\varepsilon}{5}$$

for all $t \in [T]$. So suppose by contradiction that there still exists a function $g \in S_T$ such that
$\Delta_\infty(f,g)[Y \cup Z] \le \delta$ but $|f(x) - g(x)| > 3\varepsilon$ for some $x$, and consider functions $p, q \in C$ in the
cover such that $\Delta_\infty(f,p) \le \varepsilon$ and $\Delta_\infty(g,q) \le \varepsilon$. Then $p, q \in S^C_T$ but $p \ne q$, which contradicts the
fact that $|S^C_T| \le 1$. Also notice that for all $g \in S$, if $\Delta_\infty(f,g)[Y \cup Z] \le \delta$ then $g \in S_T$. Thus
$\Delta_\infty(f,g)[Y \cup Z] \le \delta$ implies $\Delta_\infty(f,g) \le 3\varepsilon$ as desired. ■
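The iterative procedure in the proof can be sketched in code for the special case of a finite class that serves as its own cover (so $C = S$ and $k$ is taken to be $\lceil \log_2 |S| \rceil$, at least 1). This is a toy illustration under those assumptions, not a general implementation: functions are tuples of values indexed by inputs $0, \ldots, N-1$, and the names `dist_on` and `safe_winnow` are hypothetical. The returned list holds only the inputs added on top of $Y$.

```python
import math

def dist_on(f, g, points):
    """Sup-distance between f and g restricted to the given inputs (0.0 if none)."""
    return max((abs(f[x] - g[x]) for x in points), default=0.0)

def safe_winnow(S, f_star, Y, eps):
    """Winnow a finite class S (used as its own cover) down to one function.

    S: list of functions, each a tuple of values indexed by inputs 0..N-1;
    f_star must be an element of S. Returns (f, Z), where Z lists the inputs
    added on top of Y.
    """
    domain = range(len(f_star))
    k = max(1, math.ceil(math.log2(len(S))))
    delta = eps / (5 * k)
    current, f, Z = list(S), f_star, []
    while True:
        seen = list(Y) + Z
        # A witness is a g that agrees with f (up to delta) on every constraint
        # so far, yet differs from f by more than 3*eps at some input z.
        witness = next(
            ((g, z) for g in current for z in domain
             if dist_on(f, g, seen) <= delta and abs(f[z] - g[z]) > 3 * eps),
            None,
        )
        if witness is None:
            return f, Z
        g, z = witness
        Z.append(z)
        v = (f[z] + g[z]) / 2                    # threshold halfway between f and g
        A = [h for h in current if h[z] < v]
        B = [h for h in current if h[z] > v]
        current = A if len(A) < len(B) else B    # keep the smaller side (ties go to B)
        if f not in current:
            f = g                                # g lies on the side we kept

# Toy class on three inputs; start from the all-zero function with Y = {0}.
S = [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0, 1.0)]
f, Z = safe_winnow(S, f_star=S[0], Y=[0], eps=0.1)
```

Each round discards at least half of the surviving functions, so the loop runs at most $\log_2 |S|$ times. On the toy class it adds inputs 1 and 2 and ends at the function $(0, 1, 1)$, which agrees with $f^*$ on $Y$ and is the only member of $S$ consistent with the constraints on $Y \cup Z$.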



Note that Lemma 10 is still interesting in the special case $Y = \emptyset$, in which case $f^*$ is irrelevant,
and the problem reduces to finding a $Z$ such that every $g \in S$ that satisfies $\Delta_\infty(f,g)[Z] \le \frac{\varepsilon}{5k}$ also
satisfies $\Delta_\infty(f,g) \le 3\varepsilon$. In Appendix 10 we will develop the theory of 'winnowability' of p-concept
classes for its own sake. We show there that the condition $\Delta_\infty(f,g)[Z] = O(\varepsilon/k)$ can be improved
to $\Delta_1(f,g)[Z] = O(\varepsilon)$. On the other hand, the proof becomes more involved, and we no longer
know how to incorporate $f^*$ and $Y$. We also show that the condition $\Delta_\infty(f,g)[Z] = O(\varepsilon/k)$
cannot be improved to $\Delta_\infty(f,g)[Z] = O(\varepsilon)$ or even $\Delta_2(f,g)[Z] = O(\varepsilon)$.

3.3 The Real-Valued Majority-Certificates Lemma

We are finally ready to generalize Lemma 3 to the case of real-valued functions.

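Lemma 11 (Real Majority-Certificates) Let $S$ be a p-concept class, let $f^* \in S$, and let $\varepsilon > 0$.
Then for some $m = O(n/\varepsilon^2)$, there exist functions $f_1, \ldots, f_m \in S$, sets $X_1, \ldots, X_m \subseteq \{0,1\}^n$ each
of size $k = O\left(\left(n + \frac{\log^2 1/\varepsilon}{\varepsilon^2}\right) \mathrm{fat}_{\varepsilon/48}(S)\right)$, and an $\alpha = \Omega\left(\frac{\varepsilon}{(n + \log 1/\varepsilon)\, \mathrm{fat}_{\varepsilon/48}(S)}\right)$ for which the following
holds. All $g_1, \ldots, g_m \in S$ that satisfy $\Delta_\infty(f_i, g_i)[X_i] \le \alpha$ for $i \in [m]$ also satisfy $\Delta_\infty(f^*, g) \le \varepsilon$,
where

$$g(x) := \frac{g_1(x) + \cdots + g_m(x)}{m}.$$

Proof. Let

$$\beta := \frac{\varepsilon}{48}, \qquad t := C\left(n + \log\frac{1}{\beta}\right) \mathrm{fat}_\beta(S), \qquad \alpha := \frac{0.4\beta}{t},$$

where $C$ is a suitably large constant. Also, let $S_{\mathrm{fin}}$ be a finite $\alpha$-cover for $S$: that is, a finite subset
$S_{\mathrm{fin}} \subseteq S$ such that for all $f \in S$, there exists a $g \in S_{\mathrm{fin}}$ such that $\Delta_\infty(f,g) \le \alpha$. Given $f$ and $X$,
let $S[f, X]$ be the set of all $g \in S$ such that $\Delta_\infty(f,g)[X] \le \alpha$.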

Now consider a two-player game where Alice chooses a function $f \in S_{\mathrm{fin}}$ and a set $X \subseteq \{0,1\}^n$
of size $k$, and Bob simultaneously chooses an input $x \in \{0,1\}^n$. Alice's penalty in this game (the
number she is trying to minimize) equals

$$\sup_{g \in S[f,X]} |f^*(x) - g(x)|.$$

We claim that there exists a mixed strategy for Alice (that is, a probability distribution $\mathcal{P}$ over
$(f, X)$ pairs) that gives her an expected penalty of at most $\varepsilon/2$ against every pure strategy of Bob.

Let us see why the lemma follows from this claim. Fix an input $x \in \{0,1\}^n$, and suppose Alice
draws $(f_1, X_1), \ldots, (f_m, X_m)$ independently from her mixed strategy $\mathcal{P}$. Then for all $i \in [m]$,

$$\mathrm{E}\left[\sup_{g_i \in S[f_i,X_i]} |f^*(x) - g_i(x)|\right] \le \frac{\varepsilon}{2}.$$

Thus, letting $z_1, \ldots, z_m$ be independent random variables in $[0,1]$, each with expectation at most
$\varepsilon/2$, the expression

$$\Pr\left[\exists g_1 \in S[f_1,X_1], \ldots, g_m \in S[f_m,X_m] : \left|f^*(x) - \frac{g_1(x) + \cdots + g_m(x)}{m}\right| > \varepsilon\right]$$

is at most $\Pr[z_1 + \cdots + z_m > \varepsilon m]$ by the triangle inequality. This, in turn, is less than

$$2\exp\left(-\frac{2(\varepsilon m/2)^2}{m}\right) < 2^{-n}$$

by Hoeffding's inequality, provided we choose $m = O(n/\varepsilon^2)$ suitably. By the union bound, this
means that there must be a fixed choice of $f_1, \ldots, f_m$ and $X_1, \ldots, X_m$ such that

$$\left|f^*(x) - \frac{g_1(x) + \cdots + g_m(x)}{m}\right| \le \varepsilon$$

for all $g_1 \in S[f_1,X_1], \ldots, g_m \in S[f_m,X_m]$ and all inputs $x \in \{0,1\}^n$ simultaneously, as desired.

Footnote: We will need $S_{\mathrm{fin}}$ for the technical reason that the basic Minimax Theorem only works with finite strategy spaces.
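The choice of $m$ in the Hoeffding step can be made explicit. The sketch below assumes the tail bound takes the form $2\exp(-2(\varepsilon m/2)^2/m) = 2\exp(-\varepsilon^2 m/2)$ (each $z_i$ lies in $[0,1]$ with mean at most $\varepsilon/2$, so the sum must exceed its mean by at least $\varepsilon m/2$), and solves for an $m$ that pushes the bound below $2^{-n}$, which is what the union bound over all $2^n$ inputs requires. The function names are illustrative.

```python
import math

def failure_bound(m, eps):
    """Hoeffding-style bound 2*exp(-eps^2 * m / 2) on the failure probability for one input."""
    return 2 * math.exp(-eps ** 2 * m / 2)

def certificates_needed(n, eps):
    """An integer m just above 2*(n+1)*ln(2)/eps^2, so that failure_bound(m, eps) < 2^(-n)."""
    return math.floor(2 * (n + 1) * math.log(2) / eps ** 2) + 1

m = certificates_needed(n=20, eps=0.1)
```

Rearranging $2\exp(-\varepsilon^2 m/2) < 2^{-n}$ gives $m > 2(n+1)\ln 2/\varepsilon^2$, which is $O(n/\varepsilon^2)$ as the lemma states.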

We now prove the claim. By the Minimax Theorem, our task is equivalent to the following:
given any mixed strategy $\mathcal{D}$ of Bob, find a pure strategy of Alice that achieves a penalty of at most
$\varepsilon/2$ against $\mathcal{D}$. In other words, given any distribution $\mathcal{D}$ over inputs $x \in \{0,1\}^n$, we want a fixed
function $f \in S_{\mathrm{fin}}$, and a set $X \subseteq \{0,1\}^n$ of size $k$, such that

$$\mathop{\mathrm{E}}_{x \sim \mathcal{D}}\left[\sup_{g \in S[f,X]} |f^*(x) - g(x)|\right] \le \frac{\varepsilon}{2}.$$

We construct this $(f, X)$ pair as follows.

In the first stage, we let $Y$ be a set, of size at most

$$M := \frac{K}{\beta^2}\left(\mathrm{fat}_\beta(S) \log^2\frac{1}{\beta} + \log\frac{1}{\delta}\right),$$

formed by choosing $M$ inputs independently with replacement from $\mathcal{D}$. Here $\beta = \varepsilon/48$ as defined
earlier, $\delta = 1/2$, and $K$ is the constant from Corollary 9. Then by Corollary 9, with probability
at least $1 - \delta = 1/2$ over the choice of $Y$, any $g \in S$ that satisfies $\Delta_\infty(f^*,g)[Y] \le \beta$ also satisfies
$\Delta_1(f^*,g)\langle\mathcal{D}\rangle \le 11\beta$. So there must be a fixed choice of $Y$ with that property. Fix that $Y$, and
let $S'$ be the set of all $g \in S$ such that $\Delta_\infty(f^*,g)[Y] \le \beta$.

In the second stage, our goal is just to winnow $S'$ down to a particular function $f$. More
precisely, we want to find an $f \in S' \cap S_{\mathrm{fin}}$, and a set $X \subseteq \{0,1\}^n$ containing $Y$, such that any $g \in S$
that satisfies $\Delta_\infty(f,g)[X] \le \alpha$ also satisfies $\Delta_\infty(f,g) \le 13\beta$.

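We find this $(f, X)$ pair as follows. By Theorem 7, the class $S'$ has a $4\beta$-cover of size

$$N = \exp\left[O\left(\left(n + \log\frac{1}{\beta}\right) \mathrm{fat}_\beta(S')\right)\right] \le \exp\left[O\left(\left(n + \log\frac{1}{\beta}\right) \mathrm{fat}_\beta(S)\right)\right].$$

Let $t := \log_2 N$. Then by Lemma 10, there exists a function $u \in S'$, as well as a subset $Z \subseteq \{0,1\}^n$
of size at most $t$, such that:

(i) $\Delta_\infty(u,f^*)[Y] \le 0.8\beta$.

(ii) Every $g \in S'$ that satisfies $\Delta_\infty(u,g)[Y \cup Z] \le \frac{0.8\beta}{t}$ also satisfies $\Delta_\infty(u,g) \le 12\beta$.

Let $X := Y \cup Z$, and observe that

$$|X| = O\left(\frac{\mathrm{fat}_\beta(S) \log^2 1/\beta}{\beta^2} + \left(n + \log\frac{1}{\beta}\right) \mathrm{fat}_\beta(S)\right) = O\left(\left(n + \frac{\log^2 1/\varepsilon}{\varepsilon^2}\right) \mathrm{fat}_{\varepsilon/48}(S)\right)$$

as desired. Now let $f$ be a function in $S_{\mathrm{fin}}$ such that $\Delta_\infty(f,u) \le \alpha$. Let us check that $f$ has the
properties we want. First,

$$\Delta_\infty(f^*,f)[Y] \le \Delta_\infty(f^*,u)[Y] + \Delta_\infty(u,f)[Y] \le 0.8\beta + \alpha \le 0.9\beta,$$

hence $f \in S'$ as desired. Next, any $g \in S$ that satisfies $\Delta_\infty(f,g)[X] \le \alpha$ also satisfies

$$\Delta_\infty(f^*,g)[Y] \le \Delta_\infty(f^*,f)[Y] + \Delta_\infty(f,g)[Y] \le 0.9\beta + \alpha \le \beta,$$

hence $g \in S'$, hence $\Delta_1(f^*,g)\langle\mathcal{D}\rangle \le 11\beta$. So any $g \in S$ that satisfies $\Delta_\infty(f,g)[X] \le \alpha$ satisfies

$$\Delta_\infty(u,g)[Z] \le \Delta_\infty(u,f)[Z] + \Delta_\infty(f,g)[Z] \le 2\alpha = \frac{0.8\beta}{t},$$

hence $\Delta_\infty(u,g) \le 12\beta$ (since such a $g$ must belong to $S'$), hence

$$\Delta_\infty(f,g) \le \Delta_\infty(f,u) + \Delta_\infty(u,g) \le \alpha + 12\beta \le 13\beta.$$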

To conclude,

$$\mathop{\mathrm{E}}_{x \sim \mathcal{D}}\left[\sup_{g \in S[f,X]} |f^*(x) - g(x)|\right] \le \Delta_1(f^*,f)\langle\mathcal{D}\rangle + \sup_{g \in S[f,X]} \Delta_\infty(f,g) \le 11\beta + 13\beta = \frac{\varepsilon}{2}$$

as desired. This proves the claim and hence the lemma. ■

4 Application to Quantum Advice

In this section, we use the real-valued majority-certificates lemma to prove Theorems 1 and 2, as
well as several other results.

4.1 Bestiary of Quantum Complexity Classes

Given a language $L \subseteq \{0,1\}^*$, let $L : \{0,1\}^* \to \{0,1\}$ be the characteristic function of $L$. We now
give a formal definition of the class BQP/qpoly.

Definition 12 A language $L$ is in BQP/qpoly if there exists a polynomial-time quantum algorithm
$A$ and polynomial $p$ such that for all $n$, there exists an advice state $\rho_n$ on $p(n)$ qubits such that
$A(x, \rho_n)$ outputs $L(x)$ with probability $\ge 2/3$ for all $x \in \{0,1\}^n$.

Closely related to quantum advice are quantum proofs. We now recall the definition of QMA
(Quantum Merlin-Arthur), a quantum version of NP.

Definition 13 A language $L$ is in QMA if there exists a polynomial-time quantum algorithm $A$
and polynomial $p$ such that for all $x \in \{0,1\}^n$:

(i) If $x \in L$ then there exists a witness $\rho_x$ on $p(n)$ qubits such that $A(x, \rho_x)$ accepts with probability
$\ge 2/3$.

(ii) If $x \notin L$ then $A(x, \rho)$ accepts with probability $\le 1/3$ for all $\rho$.

We will actually need a generalization of QMA, which was called $\mathsf{QMA}_+$ by Aharonov and Regev.

Definition 14 A language $L$ is in $\mathsf{QMA}_+$ if there exists a polynomial-time algorithm $A$, which
takes $x \in \{0,1\}^n$ as input and produces quantum circuits $C_{x,1}, \ldots, C_{x,m}$ and rational numbers
$r_{x,1}, \ldots, r_{x,m}$ as output, as well as polynomials $p, q$ such that for all $x \in \{0,1\}^n$:

(i) If $x \in L$ then there exists a witness $\rho_x$ on $p(n)$ qubits such that $|\Pr[C_{x,i}(\rho_x) \text{ accepts}] - r_{x,i}| \le 1/q(n)$
for all $i \in [m]$.

(ii) If $x \notin L$ then for all $\rho$, there exists an $i \in [m]$ such that $|\Pr[C_{x,i}(\rho) \text{ accepts}] - r_{x,i}| > 5/q(n)$.

Aharonov and Regev [9] made the following extremely useful observation, which we prove for
completeness.

Theorem 15 ([10]) $\mathsf{QMA}_+ = \mathsf{QMA}$.

Proof. $\mathsf{QMA} \subseteq \mathsf{QMA}_+$ is obvious. For the other direction, let $L \in \mathsf{QMA}_+$, and fix an input
$x \in \{0,1\}^n$, quantum circuits $C_{x,1}, \ldots, C_{x,m}$, rational numbers $r_{x,1}, \ldots, r_{x,m}$, and polynomials
$p, q$. Then consider the following QMA verification procedure. Given a witness state $\sigma$ on
$K = O(q(n)^2 \log q(n))$ registers:

(1) Choose $i \in [m]$ uniformly at random.

(2) For $k := 1$ to $K$, apply $C_{x,i}$ to the $k$-th register of $\sigma$.

(3) If the fraction $\alpha$ of invocations that accepted satisfies $|\alpha - r_{x,i}| \le 2/q(n)$, then accept. Otherwise
reject.
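To see how steps (2) and (3) behave, one can compute the probability that the test passes for a fixed $i$ when the witness is a product state, so that the $K$ runs are independent and each accepts with the same probability. The sketch below is an illustration under that assumption (it says nothing about entangled witnesses); `pass_probability` and the parameter values are hypothetical.

```python
import math

def pass_probability(p, r, q, K):
    """Probability that the fraction of accepting runs lands within 2/q of r,
    when each of K independent runs accepts with probability p."""
    total = 0.0
    for k in range(K + 1):
        if abs(k / K - r) <= 2 / q:
            # Binomial probability of exactly k accepting runs out of K.
            total += math.comb(K, k) * p ** k * (1 - p) ** (K - k)
    return total

# Honest case: the true acceptance probability 0.6 is within 1/q of the target 0.5.
honest = pass_probability(p=0.6, r=0.5, q=10, K=200)
# Cheating product witness: the acceptance probability 1.0 is 5/q away from the target.
cheating = pass_probability(p=1.0, r=0.5, q=10, K=200)
```

In the honest case the observed fraction concentrates around 0.6, well inside the window of width $2/q$ around the target, so the test passes with high probability; in the second case every run accepts, the fraction is exactly 1, and the test never passes.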

Footnote: Aharonov and Regev actually defined $\mathsf{QMA}_+$ in a slightly more general way. However, Definition 14 is all
we need; note that all these classes turn out to equal QMA anyway.

Let $P_i(\sigma)$ be the probability that the above procedure rejects, conditioned on choosing $i \in [m]$
in step (1).

Completeness is easy: an honest Merlin can send Arthur the product state $\rho_x^{\otimes K}$. Then provided
we take $K$ sufficiently large, $P_i(\rho_x^{\otimes K}) \le 1/q(n)^2$ for all $i \in [m]$ by a Chernoff bound.

In the soundness case, suppose by way of contradiction that there exists a state $\sigma$ such that

$$\mathop{\mathrm{E}}_{i \in [m]}\left[P_i(\sigma)\right] \le \frac{2}{m\, q(n)}.$$

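Then by Markov's inequality, $P_i(\sigma) \le 2/q(n)$ for each particular $i \in [m]$. Now let $\sigma_k$ be the $k$-th
register of $\sigma$, and let $\rho := \frac{1}{K}(\sigma_1 + \cdots + \sigma_K)$. Then by linearity of expectation,

$$\left|\Pr[C_{x,i}(\rho) \text{ accepts}] - r_{x,i}\right| = \left|\mathop{\mathrm{E}}_{k \in [K]}\left[\Pr[C_{x,i}(\sigma_k) \text{ accepts}]\right] - r_{x,i}\right| \le \frac{2}{q(n)} + P_i(\sigma) \le \frac{4}{q(n)},$$

which contradicts the assumption that there exists an $i \in [m]$ such that

$$\left|\Pr[C_{x,i}(\rho) \text{ accepts}] - r_{x,i}\right| > \frac{5}{q(n)}.$$

The theorem now reduces to the standard fact that QMA protocols can be amplified to any desired
$1/\mathrm{poly}(n)$ soundness gap. ■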

To state our results, it will be helpful to have the further notion of untrusted advice, which is like
advice in that it depends only on the input length $n$, but like a witness in that it cannot be trusted.
This notion has been studied before: Chakaravarthy and Roy [16] and Fortnow and Santhanam
[17] defined the complexity class ONP ('Oblivious NP'), which is like NP except that the witness
can depend only on the input length. Independently, Aaronson [5] defined the complexity class
YP, which is easily seen to equal $\mathsf{ONP} \cap \mathsf{coONP}$. We will adopt the 'Y' notation in this paper,
because it is much easier to write YQP/poly (for example) than $\mathsf{OQMA/poly} \cap \mathsf{coOQMA/poly}$.

We now give a formal definition of YP, as well as a slight variant called YP*.

Definition 16 A language $L$ is in YP if there exist polynomial-time algorithms $A, B$ and a polynomial
$p$ such that:

(i) For all $n$, there exists an advice string $y_n \in \{0,1\}^{p(n)}$ such that $A(x, y_n) = 1$ for all $x \in \{0,1\}^n$.

(ii) If $A(x, y) = 1$, then $B(x, y) = L(x)$.

$L$ is in YP* if moreover $A$ ignores $x$, depending only on $y$.



Footnote: YP stands for 'Yoda Polynomial-Time,' a nomenclature that seems to make neither more nor less sense than
'Arthur-Merlin.'

Clearly $\mathsf{P} \subseteq \mathsf{YP}^* \subseteq \mathsf{YP} \subseteq \mathsf{P/poly} \cap \mathsf{NP} \cap \mathsf{coNP}$. Also, Aaronson [5] showed that $\mathsf{ZPP} \subseteq \mathsf{YP}$. We
will be interested in the natural quantum analogues of YP and YP*:

Definition 17 A language $L$ is in YQP if there exist polynomial-time quantum algorithms $A, B$
and a polynomial $p$ such that:

(i) For all $n$, there exists an advice state $\rho_n$ on $p(n)$ qubits such that $A(x, \rho_n)$ accepts with
probability $\ge 2/3$ for all $x \in \{0,1\}^n$.

(ii) If $A(x, \rho)$ accepts with probability $\ge 1/3$, then $B(x, \rho)$ outputs $L(x)$ with probability $\ge 2/3$.

$L$ is in YQP* if moreover $A$ ignores $x$, depending only on $\rho$.

Clearly $\mathsf{BQP} \subseteq \mathsf{YQP}^* \subseteq \mathsf{YQP} \subseteq \mathsf{BQP/qpoly} \cap \mathsf{QMA} \cap \mathsf{coQMA}$. By direct analogy to $\mathsf{QMA}_+$, we
can define the following generalizations of YQP and YQP*:

Definition 18 A language $L$ is in $\mathsf{YQP}_+$ if there exists a polynomial-time algorithm $A$, which
takes $x \in \{0,1\}^n$ as input and produces quantum circuits $C_{x,1}, \ldots, C_{x,m}$ and rational numbers
$r_{x,1}, \ldots, r_{x,m}$ as output; a polynomial-time quantum algorithm $B$; and polynomials $p, q$ such that:

(i) For all $n$, there exists an advice state $\rho_n$ on $p(n)$ qubits such that $|\Pr[C_{x,i}(\rho_n) \text{ accepts}] - r_{x,i}| \le 1/q(n)$
for all $i \in [m]$ and $x \in \{0,1\}^n$.

(ii) If $|\Pr[C_{x,i}(\rho) \text{ accepts}] - r_{x,i}| \le 5/q(n)$ for all $i \in [m]$, then $B(x, \rho)$ outputs $L(x)$ with
probability $\ge 2/3$.

$L$ is in $\mathsf{YQP}^*_+$ if moreover $A$ ignores $x$.

Then we have the following direct counterpart to Theorem 15.

Theorem 19 $\mathsf{YQP}_+ = \mathsf{YQP}$ and $\mathsf{YQP}^*_+ = \mathsf{YQP}^*$.

Proof. For $\mathsf{YQP} \subseteq \mathsf{YQP}_+$ and $\mathsf{YQP}^* \subseteq \mathsf{YQP}^*_+$, we simply take $m = 1$ and take $q(n)$ to be a
constant. For $\mathsf{YQP}_+ \subseteq \mathsf{YQP}$, the simulation procedure is essentially the same as in the proof of
Theorem 15. Namely, let $L \in \mathsf{YQP}_+$, and fix an input $x \in \{0,1\}^n$, quantum circuits $C_{x,1}, \ldots, C_{x,m}$
generated by an algorithm $A$, rational numbers $r_{x,1}, \ldots, r_{x,m}$, polynomials $p, q$, and an algorithm
$B$. Then given a witness state $\sigma$ on $K = O(q(n)^2 \log q(n))$ registers, the YQP algorithm $A'$ does
the following:

(1) Choose $i \in [m]$ uniformly at random.

(2) For $k := 1$ to $K$, apply $C_{x,i}$ to the $k$-th register of $\sigma$.

(3) If the fraction $\alpha$ of invocations that accepted satisfies $|\alpha - r_{x,i}| \le 2/q(n)$, then accept. Otherwise
reject.

Likewise, let $\sigma_k$ be the $k$-th register of $\sigma$. Then the YQP algorithm $B'$ chooses $k \in [K]$ uniformly
at random, runs $B(x, \sigma_k)$, and outputs the result. One can check that conditions (i) and (ii) in the
definition of YQP are both satisfied, albeit with $1 - 1/q(n)^2$ and $1 - 2/q(n)^2$ in place of $2/3$ and
$1/3$ (which is not an important difference, because of amplification). The proof of $\mathsf{YQP}^*_+ \subseteq \mathsf{YQP}^*$
is the same, except that both $A$ and $A'$ ignore the input $x$ when generating the $C_{x,i}$'s and $r_{x,i}$'s. ■

4.2 Characterizing Quantum Advice

Fix a polynomial-size quantum circuit $Q$. For a given advice state $\rho$, let $f_\rho(x) := \Pr[Q \text{ accepts } x, \rho]$.
Let $S$ be the p-concept class consisting of $f_\rho$ for all $p(n)$-qubit mixed states $\rho$. Then Aaronson [5]
proved the following.

Theorem 20 ([5]) $\mathrm{fat}_\gamma(S) = O(p(n)/\gamma^2)$.

We now prove the following characterization of BQP/qpoly, which immediately implies (and
strengthens) Theorem 2.

Theorem 21 $\mathsf{BQP/qpoly} = \mathsf{YQP/poly}$.

Proof. One direction ($\mathsf{YQP/poly} \subseteq \mathsf{BQP/qpoly}$) is obvious, since untrusted quantum advice
and trusted classical advice can both be simulated by trusted quantum advice. We prove that
$\mathsf{BQP/qpoly} \subseteq \mathsf{YQP/poly}$. It suffices to show that $\mathsf{BQP/qpoly} \subseteq \mathsf{YQP}_+\mathsf{/poly}$, since $\mathsf{YQP} = \mathsf{YQP}_+$ by
Theorem 19. Let $L \in \mathsf{BQP/qpoly}$, let $Q$ be a quantum algorithm that decides $L$ with completeness
and soundness errors $1/5$, and let $x \in \{0,1\}^n$ be the input. Also, let $f_\xi(z) := \Pr[Q(z, \xi) \text{ accepts}]$,
where $\xi$ is a $p(n)$-qubit quantum advice state for $Q$. Then by definition, there exists a 'true' advice
state $\rho_n$ such that

$$|f_{\rho_n}(z) - L(z)| \le 0.2$$

for all $z \in \{0,1\}^n$. Let $S$ be the p-concept class consisting of $f_\xi$ for all $p(n)$-qubit mixed states
$\xi$. Then Theorem 20 implies that $\mathrm{fat}_\gamma(S) = O(p(n)/\gamma^2)$ for all $\gamma > 0$. Set $\gamma := 1/480$.
Then by Lemma 11, for some $m = O(n)$, there exist $p(n)$-qubit mixed states $\rho[1], \ldots, \rho[m]$, sets
$X_1, \ldots, X_m \subseteq \{0,1\}^n$ each of size $k = O(n \cdot p(n))$, and an $\alpha = \Omega\left(\frac{1}{n \cdot p(n)}\right)$ for which the following
holds:

(*) All $p(n)$-qubit states $\sigma[1], \ldots, \sigma[m]$ that satisfy $\Delta_\infty(f_{\rho[i]}, f_{\sigma[i]})[X_i] \le 5\alpha$ for $i \in [m]$ also
satisfy $\Delta_\infty(f_{\rho_n}, f_\sigma) \le 0.1$, where $\sigma := \frac{1}{m}(\sigma[1] + \cdots + \sigma[m])$.

Our $\mathsf{YQP}_+\mathsf{/poly}$ simulation is now the following. The classical /poly advice encodes the sets
$X_1, \ldots, X_m$, as well as a rational approximation $r_{i,z}$ to $f_{\rho[i]}(z)$ for each $i \in [m]$ and $z \in X_i$. The
untrusted quantum advice $\rho'_n$ consists of $m$ registers of $p(n)$ qubits each; in the honest case, $\rho'_n$ is
simply $\rho[1] \otimes \cdots \otimes \rho[m]$. Let $\sigma[i]$ be the $i$-th register of $\rho'_n$. Then given the advice, the $\mathsf{YQP}_+$
machine $A$ outputs a circuit $C_{i,z}$ that runs $Q(z, \sigma[i])$ and outputs the result, for each $i \in [m]$ and
$z \in X_i$. The machine $B$ chooses $i \in [m]$ uniformly at random, then runs $Q(x, \sigma[i])$ and outputs
the result.

We are interested in the difference between $\Pr[C_{i,z}(\rho'_n) \text{ accepts}]$ and $r_{i,z}$. In the honest case,

$$\Pr[C_{i,z}(\rho'_n) \text{ accepts}] = \Pr[Q(z, \rho[i]) \text{ accepts}] = f_{\rho[i]}(z)$$

for all $i, z$. Moreover, we can easily arrange each $r_{i,z}$ to be within $\alpha$ of $f_{\rho[i]}(z)$, by using $O(\log n)$
bits to specify each $r_{i,z}$. For the soundness case, suppose

$$\left|\Pr[C_{i,z}(\rho'_n) \text{ accepts}] - r_{i,z}\right| \le 5\alpha$$

for all $i \in [m]$ and $z \in X_i$. Then by (*), we have $\Delta_\infty(f_{\rho_n}, f_\sigma) \le 0.1$. Notice that by linearity of
expectation,

$$\Pr[B \text{ accepts}] = \mathop{\mathrm{E}}_{i \in [m]}\left[\Pr[Q(x, \sigma[i]) \text{ accepts}]\right] = f_\sigma(x),$$

and that this holds regardless of what entanglement might be present among the $m$ registers
$\sigma[1], \ldots, \sigma[m]$. Hence

$$\left|\Pr[B \text{ accepts}] - L(x)\right| \le \left|\Pr[B \text{ accepts}] - f_{\rho_n}(x)\right| + \left|f_{\rho_n}(x) - L(x)\right| \le 0.1 + 0.2,$$

which is less than $1/3$ as desired, and $L \in \mathsf{YQP}_+\mathsf{/poly} = \mathsf{YQP/poly}$. ■
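The linearity step used for machine $B$ can be checked numerically on a toy example. The sketch below assumes single-qubit registers, models the acceptance probability of a fixed circuit as $\mathrm{Tr}(E\rho)$ for a fixed $2 \times 2$ effect $E$, and treats each matrix in `sigmas` as the reduced state of one register; these modelling choices and all names are illustrative, and exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction as F

def trace_product(E, rho):
    """Tr(E rho) for 2x2 matrices given as nested lists."""
    return sum(E[i][j] * rho[j][i] for i in range(2) for j in range(2))

def average_state(states):
    """Entrywise average of a list of 2x2 density matrices."""
    m = len(states)
    return [[sum(s[i][j] for s in states) / m for j in range(2)] for i in range(2)]

E = [[F(1), F(0)], [F(0), F(0)]]                # hypothetical accepting effect |0><0|
sigmas = [
    [[F(1), F(0)], [F(0), F(0)]],               # register 1 in state |0><0|
    [[F(1, 2), F(1, 2)], [F(1, 2), F(1, 2)]],   # register 2 in state |+><+|
    [[F(0), F(0)], [F(0), F(1)]],               # register 3 in state |1><1|
]

# B picks a register uniformly at random and measures it ...
b_accept = sum(trace_product(E, s) for s in sigmas) / len(sigmas)
# ... which matches measuring the averaged state sigma directly.
f_sigma = trace_product(E, average_state(sigmas))
```

Both quantities equal $1/2$ here, since the trace is linear in the state; only the reduced state of the chosen register enters, which is why entanglement between registers does not affect the identity.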

Theorem 21 actually yields the stronger result that $\mathsf{BQP/qpoly} \subseteq \mathsf{YQP}^*\mathsf{/poly}$, since the machine
$A$ had no dependence on the input $x$. We therefore have $\mathsf{BQP/qpoly} = \mathsf{YQP}^*\mathsf{/poly} = \mathsf{YQP/poly}$: the
two definitions of YQP collapse in the presence of polynomial-size classical advice. Since we never
needed the assumption that the BQP/qpoly machine computes a language (i.e., a total Boolean
function), another strengthening we can easily observe is $\mathsf{PromiseBQP/qpoly} = \mathsf{PromiseYQP/poly}$.

If we prefer, we can interpret Theorem 21 as a statement about quantum communication protocols
rather than quantum complexity classes. The following theorem makes this connection more
precise.

Theorem 22 Suppose that Alice, who is computationally unbounded, has a classical description of
an $n$-qubit quantum state $\rho$. She wants to send $\rho$ to Bob, who is limited to BQP computations.
Alice has at her disposal a noiseless one-way classical channel to Bob, as well as a noisy one-way
quantum channel. Then for all $m$ and $\varepsilon > 0$, there exists a protocol whereby

(i) Alice sends Bob a classical string $y$ of $\mathrm{poly}(n, m, 1/\varepsilon)$ bits, as well as a state $\sigma$ of $\mathrm{poly}(n, m, 1/\varepsilon)$
qubits.

(ii) Bob receives $y$ together with a possibly-corrupted version $\widetilde{\sigma}$ of $\sigma$.

(iii) If $\widetilde{\sigma} = \sigma$, then for any measurement $E$ performed by a circuit with at most $m$ gates, Bob can
perform another measurement $f_y(E)$ on $\widetilde{\sigma}$, and then output a number $\beta \in [0,1]$ such that
$|\beta - \mathrm{Tr}(E\rho)| \le \varepsilon$ with $1 - 1/\exp(n)$ probability. Here $f_y(E)$ can be computed in polynomial
time given $y$ together with a description of $E$.

(iv) For every $\widetilde{\sigma}$ and every such measurement $E$, with $1 - 1/\exp(n)$ probability Bob outputs either
'FAIL' or else a $\beta \in [0,1]$ such that $|\beta - \mathrm{Tr}(E\rho)| \le \varepsilon$.

Proof. This is just a direct translation of Theorem 21 to the communication setting. The string
$y$ plays the role of the trusted classical advice, the state $\sigma$ plays the role of the untrusted quantum
advice, the measurement $E$ plays the role of the input $x \in \{0,1\}^n$, and Bob plays the role of the
verifier. To get $1 - 1/\exp(n)$ success probability, we amplify the protocol $O(n)$ times, which just
makes $y$ and $\sigma$ polynomially longer. ■

4.3 The Complexity of Preparing Quantum Advice States

If we combine Theorem 21 with known QMA-completeness results, we can obtain a striking consequence
for quantum complexity theory. Namely, the preparation of quantum advice states can
always be reduced to the preparation of ground states of local Hamiltonians, despite the fact that
quantum advice states involve an exponential number of constraints, while ground states of local
Hamiltonians involve only a polynomial number. In particular, if ground states of local Hamiltonians
can be prepared by polynomial-size circuits, then we have not only $\mathsf{QMA} = \mathsf{QCMA}$, but also
$\mathsf{BQP/qpoly} = \mathsf{BQP/poly}$. The following theorem makes this connection precise.

Theorem 23 Let $Q$ be a polynomial-size quantum circuit that takes an advice state $\rho_n$. Then
there exists another polynomial-size quantum circuit $Q'$ with the following property. For all $n$ and
$\varepsilon > 0$, there exists a 2-local Hamiltonian $H$ on $\mathrm{poly}(n, 1/\varepsilon)$ qubits, such that for all ground states
$|\phi\rangle$ of $H$ and inputs $x \in \{0,1\}^n$,

$$\left|\Pr[Q' \text{ accepts } x, |\phi\rangle] - \Pr[Q \text{ accepts } x, \rho_n]\right| \le \varepsilon.$$

Furthermore, $Q'$ can be efficiently generated given $Q$ together with a description of $H$.

Proof. Kempe, Kitaev, and Regev [23] proved that the 2-Local Hamiltonians problem is QMA-complete.
Furthermore, examining their proof, we find that it yields the following stronger result.
Let $V$ be a QMA verification procedure with completeness and soundness errors $\delta$. Then there
exists a 2-local Hamiltonian $H$, as well as a polynomial-time 'recovery procedure' $R$, such that if
$|\phi\rangle$ is any ground state of $H$, then with $\Omega(1/\mathrm{poly}(n))$ probability, $R(|\phi\rangle)$ outputs a state $|\psi\rangle$ such
that $\Pr[V \text{ accepts } |\psi\rangle] \ge 1 - \delta$. To prove the stronger result: consider a ground state of $H$, which
Kempe et al. show to be a history state of the form

$$|\phi\rangle = \frac{1}{\sqrt{T}} \sum_{t=1}^{T} |t\rangle |\psi_t\rangle.$$

Then $R$ can simply measure the clock register $|t\rangle$, postselect on obtaining the outcome $t = 1$, and
then retrieve $|\psi\rangle$ from the computation register $|\psi_1\rangle$.

Indeed, we can strengthen the above result further, to increase $R$'s success probability from
$\Omega(1/\mathrm{poly}(n))$ to $1 - \delta$. To do so, we simply increase the number of steps $T$ by a $1/\delta$ factor, then
put additional terms in $H$ to impose the constraint that the computation should do nothing for
the first $(1 - \delta)T$ steps (leaving $|\psi\rangle$ unchanged), and only apply $V$ during the final $\delta T$ steps.

Now let $Q$ be a polynomial-size quantum circuit that takes advice state $\rho_n$, and let $(A, B)$ be
the YQP/poly checking algorithm (with error parameter $\delta$) from Theorem 21. Then by the above,
there exists a 2-local Hamiltonian $H$ on $\mathrm{poly}(n, 1/\delta)$ qubits, as well as a polynomial-time algorithm
$R$, such that

(i) If $|\phi\rangle$ is any ground state of $H$, then with at least $1 - \delta$ probability, $R(|\phi\rangle)$ outputs a state
$|\psi\rangle$ such that $\Pr[A \text{ accepts } |\psi\rangle] \ge 1 - \delta$.

(ii) This $|\psi\rangle$ satisfies $|B_\psi(x) - Q_{\rho_n}(x)| \le \delta$ for all $x \in \{0,1\}^n$, where $B_\psi(x) := \Pr[B \text{ accepts } x, |\psi\rangle]$
and $Q_{\rho_n}(x) := \Pr[Q \text{ accepts } x, \rho_n]$.

We can now combine $R$ and $B$ into a single algorithm $Q'$, such that $|Q'_\phi(x) - Q_{\rho_n}(x)| \le 2\delta$ for
all $x \in \{0,1\}^n$. Setting $\delta := \varepsilon/2$ then yields the theorem. ■

Let us make two remarks about Theorem 23. First, as a 'free byproduct,' we get that

$$\left|\Pr[Q' \text{ accepts } x, |\phi\rangle] - \Pr[Q \text{ accepts } x, \rho_n]\right| \le 2\varepsilon$$

for all $|\phi\rangle$ that are $\varepsilon$-close in trace distance to a ground state of $H$. Second, there is nothing
special here about 2-Local Hamiltonians. So far as we know, all existing QMA-completeness
reductions have the property we needed for Theorem 23: namely, the property that any ground
state of the new instance can be transformed into a QMA witness for the original instance, with
$\Omega(1/\mathrm{poly}(n))$ success probability. As one example, Aharonov et al. [7] showed that even finding
the ground state energy of a nearest-neighbor Hamiltonian on the line is QMA-complete, provided
the line is composed of qudits with $d \ge 12$. We can combine their result with Theorem 21 to show
that for all $L \in \mathsf{BQP/qpoly}$, there exists a nearest-neighbor qudit Hamiltonian $H$ on the line, such
that any ground state of $H$ is a valid quantum advice state for $L$.

Proof of Theorem 1. Fix $c, \varepsilon > 0$, and let $\rho$ be the $n$-qubit state in Theorem 1. Let $Q(C, \xi)$ be
an efficiently constructible polynomial-size quantum circuit that takes a description of a quantum
measurement circuit $C$ of size $n^c$, as well as a quantum state $\xi$ of $n$ qubits, and that outputs the
measurement result $C(\xi)$.

Fix $\rho_n := \rho$. Let $H$ be the 2-local Hamiltonian given by Theorem 23, with ground state $|\psi\rangle$,
and let $Q'(C, \xi)$ be the circuit in Theorem 23, which is efficiently constructible given $Q$ and $H$.
Then, if we define the measurement $C'$ as $C'(\xi) := Q'(C, \xi)$, we have

$$\left|C'(|\psi\rangle\langle\psi|) - C(\rho)\right| = \left|Q'(C, |\psi\rangle\langle\psi|) - Q(C, \rho)\right| \le \varepsilon.$$



5 Further Implications for Quantum Complexity Theory 

In this section, we use the $\mathsf{BQP/qpoly} = \mathsf{YQP/poly}$ theorem to harvest two more results about
quantum complexity classes. The first is an 'exchange theorem' stating that $\mathsf{QCMA/qpoly} \subseteq \mathsf{QMA/poly}$:
in other words, one can always simulate quantum advice together with a classical witness
by classical advice together with a quantum witness. This is a straightforward generalization
of Theorem 21. The second result is a 'Quantum Karp-Lipton Theorem,' which states that if
$\mathsf{NP} \subset \mathsf{BQP/qpoly}$ (that is, NP-complete problems are efficiently solvable by quantum computers
with quantum advice), then $\Pi_2^P \subseteq \mathsf{QMA}^{\mathsf{PromiseQMA}}$, which one can think of as 'almost as bad' as a
collapse of the polynomial hierarchy. This result makes essential use of Theorem 21 and is a good
illustration of how that theorem can be applied in quantum complexity theory.

Theorem 24 (Exchange Theorem) $\mathsf{QCMA/qpoly} \subseteq \mathsf{QMA/poly}$.

Proof. The proof is almost the same as that of Theorem 21. Let $L \in \mathsf{QCMA/qpoly}$. Then there
exists a polynomial-time quantum verifier $Q$, a family of polynomial-size advice states $\{\rho_n\}_n$, and
a polynomial $p$ such that for all inputs $x \in \{0,1\}^n$:

• $x \in L \Longrightarrow \exists w \in \{0,1\}^{p(n)}\ \Pr[Q(x, w, \rho_n) \text{ accepts}] \ge 2/3$.

• $x \notin L \Longrightarrow \forall w \in \{0,1\}^{p(n)}\ \Pr[Q(x, w, \rho_n) \text{ accepts}] \le 1/3$.

Now consider the following promise problem: given $x$ and $w$ as input, as well as a constant
$c \in [0,1]$, decide whether $\Pr[Q(x, w, \rho_n) \text{ accepts}]$ is at most $c - 1/10$ or at least $c + 1/10$, promised
that one of these is the case. (Equivalently, estimate the probability within an additive error
$\pm 1/10$.) This problem is clearly in PromiseBQP/qpoly, since we can take $\rho_n$ as the advice. So by
Theorem 21, the problem is in PromiseYQP/poly as well.

We claim this implies $L \in \mathsf{QMA/poly}$. For our QMA/poly verifier can take the PromiseYQP/poly
advice string $a_n$ as its advice, and a state of the form $\sigma \otimes |w\rangle\langle w|$ as its witness. It can then do
the following:

(1) Using $a_n$, check that $\sigma$ is a valid witness for the PromiseYQP/poly protocol, and reject if not.

(2) Using $\sigma$, check that $\Pr[Q(x, w, \rho_n) \text{ accepts}] \ge 2/3$ (under the promise that it is either at
least $2/3$ or at most $1/3$).



Indeed, let YQ-QCMA denote the complexity class where a BQP verifier receives a classical
witness that depends on the input, as well as a quantum witness that depends only on the input
size $n$. Then we can characterize QCMA/qpoly as equal to YQ-QCMA/poly, similarly to how we
characterized BQP/qpoly as equal to YQP/poly.

We now use Theorem 21 to prove an analogue of the Karp-Lipton Theorem for quantum advice.

Theorem 25 (Quantum Karp-Lipton Theorem) If $\mathsf{NP} \subset \mathsf{BQP/qpoly}$, then $\Pi_2^P \subseteq \mathsf{QMA}^{\mathsf{PromiseQMA}}$.

Proof. By Theorem 21, the hypothesis implies $\mathsf{NP} \subseteq \mathsf{YQP/poly} = \mathsf{YQP}^*\mathsf{/poly}$. So let $Q$ be a
YQP*/poly algorithm to decide SAT, which takes an input $x$, a trusted advice string $a$, and an
untrusted advice state $\rho$. Let $a^*$ be the correct value of the advice string.

Now consider an arbitrary language $L \in \Pi_2^P$, which is defined by a polynomial-time predicate
$R(x, y, z)$ like so:

• $x \in L \Longleftrightarrow \forall y\, \exists z\ R(x, y, z)$.

Using $Q$, we can create a pair of quantum algorithms $Q_1(a, \rho)$, $Q_2(\rho, x, y)$ with the following
properties:

(P1) There exists a $\rho$ such that $\Pr[Q_1(a^*, \rho) \text{ accepts}] \ge 2/3$.

(P2) If $\Pr[Q_1(a^*, \rho) \text{ accepts}] \ge 1/3$, then for all $x, y$ pairs, $\Pr[Q_2(\rho, x, y) \text{ accepts}] \ge 2/3$ if there
exists a $z$ such that $R(x, y, z)$ holds, and $\Pr[Q_2(\rho, x, y) \text{ accepts}] \le 1/3$ otherwise.

Using standard amplification and NP self-reducibility, we can then strengthen property (P2) to
the following, for some quantum algorithm $Q'_2(\rho, x, y)$:

(P2') If $\Pr[Q_1(a^*, \rho) \text{ accepts}] \ge 1/3$, then for all $x, y$ pairs, $Q'_2(\rho, x, y)$ outputs a $z$ such that
$R(x, y, z)$ holds with probability at least $2/3$, whenever such a $z$ exists.

Figure 2: Containments among complexity classes related to quantum proofs and advice, in light of
this paper's results. The containments $\mathsf{QMA/qpoly} \subseteq \mathsf{PSPACE/poly}$ and $\mathsf{QCMA/qpoly} \subseteq \mathsf{PP/poly}$
were shown previously by Aaronson [4]. This paper shows that $\mathsf{BQP/qpoly} \subseteq \mathsf{QMA/poly}$, and
indeed $\mathsf{BQP/qpoly} = \mathsf{YQP/poly}$, where YQP is like QMA except that the quantum witness can
depend only on the input length $n$. It also shows that $\mathsf{QCMA/qpoly} \subseteq \mathsf{QMA/poly}$.



Now let $U(a,\rho,x,y)$ be a quantum algorithm that does one of the following, each with
probability 1/2:

• Runs $Q_1(a,\rho)$, and accepts if and only if $Q_1$ rejects.

• Runs $Q_2'(\rho,x,y)$, and accepts if and only if $R(x,y,Q_2'(\rho,x,y))$ holds.

Then we claim that

(A1) $x \in L \implies \exists a,\rho\; [\Pr[Q_1(a,\rho)\text{ accepts}] \ge 2/3] \wedge [\forall \sigma,y\; \Pr[U(a,\sigma,x,y)\text{ accepts}] \ge 1/3]$.
(A2) $x \notin L \implies \forall a,\rho\; [\Pr[Q_1(a,\rho)\text{ accepts}] < 1/2] \vee [\exists \sigma,y\; \Pr[U(a,\sigma,x,y)\text{ accepts}] \le 1/4]$.

It is clear that this claim implies $L \in \mathsf{QMA}^{\mathsf{PromiseQMA}}$. (The crucial point here is that $U$ does
not take the existentially-quantified advice state $\rho$ as input, and therefore the QMA machine does
not need to pass a quantum state to the PromiseQMA oracle, which would be illegal. This is why
we needed the BQP/qpoly = YQP*/poly theorem for this result.)

We now prove the claim. First suppose $x \in L$. Then there exists an advice string $a = a^*$ with
the following properties:

(B1) There exists a $\rho$ such that $\Pr[Q_1(a,\rho)\text{ accepts}] \ge 2/3$. (By (P1).)

(B2) For all $\sigma,y$ pairs, either $\Pr[Q_1(a,\sigma)\text{ rejects}] \ge 2/3$, or $\Pr[R(x,y,Q_2'(\sigma,x,y))\text{ holds}] \ge 2/3$.
(By (P2') and $x \in L$.)

By (B2), we have $\forall \sigma,y\; \Pr[U(a,\sigma,x,y)\text{ accepts}] \ge 1/3$. Together with (B1), this proves (A1).

Next suppose $x \notin L$. Then given an advice string $a$, suppose there exists a $\rho$ such that
$\Pr[Q_1(a,\rho)\text{ accepts}] \ge 1/2$. Set $\sigma := \rho$, and choose a $y$ for which there is no $z$ such that
$R(x,y,z)$ holds. Then

• $\Pr[Q_1(a,\sigma)\text{ rejects}] \le 1/2$. (By assumption.)
• $\Pr[R(x,y,Q_2'(\sigma,x,y))\text{ holds}] = 0$. (By $x \notin L$.)

Combining these, $\Pr[U(a,\sigma,x,y)\text{ accepts}] \le 1/4$. This proves (A2), and hence the claim, and
hence the theorem. ■
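
As a quick check of the constants, the two bounds on the acceptance probability of $U$ can be written out explicitly. The derivation below is only a restatement of the argument above; it assumes nothing beyond the definition of $U$ as an equal mixture of its two branches.

```latex
\begin{align*}
\Pr[U(a,\sigma,x,y)\text{ accepts}]
  &= \tfrac12 \Pr[Q_1(a,\sigma)\text{ rejects}]
   + \tfrac12 \Pr[R(x,y,Q_2'(\sigma,x,y))\text{ holds}]. \\[4pt]
\text{If } x \in L:\quad
\Pr[U(a,\sigma,x,y)\text{ accepts}]
  &\ge \tfrac12 \cdot \tfrac23 = \tfrac13
  \qquad\text{(by (B2), one of the two terms is at least } \tfrac23\text{)}. \\[4pt]
\text{If } x \notin L:\quad
\Pr[U(a,\sigma,x,y)\text{ accepts}]
  &\le \tfrac12 \cdot \tfrac12 + \tfrac12 \cdot 0 = \tfrac14 .
\end{align*}
```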

Previously, Aaronson [3] showed that if PP $\subseteq$ BQP/qpoly, then the counting hierarchy CH
collapses. However, he had been unable to show that NP $\subseteq$ BQP/qpoly would have unlikely
consequences in the uniform world.

6 Open Problems 

One open problem is simply to find more applications of the majority-certificates lemma, which
seems likely to have uses outside of quantum complexity theory; we mention one application (to
'untrusted oracles') in Appendix 8. Can we improve the parameters of the majority-certificates
lemma (the size of the certificates or the number $O(n)$ of certificates), or alternatively, show that
the current parameters are essentially optimal? Also, can we prove the real-valued
majority-certificates lemma with an error tolerance $\alpha$ that depends only on the desired accuracy $\varepsilon$ of the
final approximation, not on $n$ or the fat-shattering dimension of $S$?

On the quantum complexity side, we mention several questions. First, in Theorem 23, is
the polynomial blowup in the number of qubits unavoidable? Could one hope for a way to
simulate an $n$-qubit advice state by the ground state of an $n$-qubit local Hamiltonian, or would that
have implausible complexity consequences? Second, can we use the ideas in this paper to prove
any upper bound on the class QMA/qpoly better than the PSPACE/poly upper bound shown by
Aaronson [4]? Third, if NP $\subseteq$ BQP/qpoly, then does $\mathsf{QMA}^{\mathsf{PromiseQMA}}$ contain not just $\Pi_2^P$ but the
entire polynomial hierarchy? Finally, is BQP/qpoly = BQP/poly?

8 Appendix: Untrusted Oracles

In this appendix, we give an interesting consequence of the majority-certificates lemma for classical 
complexity theory. 

When we give a machine an oracle, normally we assume the oracle can be trusted. But it is 
also natural to consider untrusted oracles, which are nevertheless restricted in their computational 
power. We formalize this notion as follows: 

Definition 26 (Untrusted Oracles) Let $\mathcal{C}$ and $\mathcal{D}$ be complexity classes. Also, given a family
$a = \{a_n\}_{n \ge 1}$ of $p(n)$-bit advice strings and a machine $V$, let $V[a]$ be the language decided by $V$
given $a$ as advice. Then $\mathcal{C}^{\text{Untrusted-}\mathcal{D}}$ is the class of languages $L$ for which there exists a $\mathcal{C}$ machine
$U$, a $\mathcal{D}$ machine $V$, and a polynomial $p$ such that for all $n$:

(i) There exist $p(n)$-bit advice strings $a_1,\ldots,a_m$ such that $U^{V[a_1],\ldots,V[a_m]}$ decides $L$.

(ii) $U^{V[a_1],\ldots,V[a_m]}(x)$ outputs either $L(x)$ or 'FAIL', for all inputs $x \in \{0,1\}^n$ and all $p(n)$-bit
advice strings $a_1,\ldots,a_m$.

We can now state the consequence. 

Theorem 27 Let $\mathcal{C}$ be a uniform syntactic complexity class, such as P, NP, or EXP. Then
$\mathcal{C}/\text{poly} \subseteq (\mathsf{AC}^0)^{\text{Untrusted-}\mathcal{C}}$.

Proof. Let $V$ be a $\mathcal{C}$/poly machine that uses a family $a = \{a_n\}_{n \ge 1}$ of $p(n)$-bit advice strings. Fix
an input length $n$, and let $f_w(x)$ be the output of $V$ on input $x$ and advice string $w \in \{0,1\}^{p(n)}$.
Then $S = \{f_w\}_{w \in \{0,1\}^{p(n)}}$ is a Boolean concept class of size $|S| \le 2^{\mathrm{poly}(n)}$. So by Lemma 3, there
exist $m = O(n)$ polynomial-size certificates $C_1,\ldots,C_m$, which isolate functions $f_1,\ldots,f_m \in S$
respectively such that $\mathrm{MAJ}(f_1,\ldots,f_m) = f_{a_n}$. Now, we can easily modify the proof of Lemma 3
to ensure not only that $\mathrm{MAJ}(f_1,\ldots,f_m) = f_{a_n}$, but also that

$$f_{a_n}(x) = 1 \implies f_1(x) + \cdots + f_m(x) \ge \frac{2m}{3},$$

$$f_{a_n}(x) = 0 \implies f_1(x) + \cdots + f_m(x) \le \frac{m}{3}$$

for all inputs $x$. To do so, we simply take $m = O(n)$ sufficiently large and redo the Chernoff bound.
Furthermore, it is known that Approximate Majority (that is, Majority where the fraction
of 1's in the input is bounded away from 1/2 by a constant) can be computed by polynomial-size
depth-3 circuits, so in particular, in $\mathsf{AC}^0$ (see Viola [30] for example).

By hardwiring the certificates $C_1,\ldots,C_m$ into the $\mathsf{AC}^0$ circuit, we can produce an $\mathsf{AC}^0$ circuit
that first checks whether $f_i$ is consistent with $C_i$ for all $i \in [m]$, outputs 'FAIL' if not, and otherwise
outputs $U^{f_1,\ldots,f_m}(x) = f_{a_n}(x)$. ■
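
The check-then-vote structure of this proof can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: oracles are modeled as Python callables, a certificate is a hypothetical dictionary from inputs to required output bits, and an exact majority vote stands in for the $\mathsf{AC}^0$ Approximate Majority circuit. The toy certificates are full truth tables, whereas the proof uses polynomial-size certificates that isolate functions within $S$.

```python
from typing import Callable, Dict, List, Union

BoolFn = Callable[[int], int]      # an oracle answering f(x) in {0, 1}
Certificate = Dict[int, int]       # hypothetical encoding: input -> required output bit


def consistent(f: BoolFn, cert: Certificate) -> bool:
    """Check that oracle f agrees with every input/output constraint in cert."""
    return all(f(x) == bit for x, bit in cert.items())


def untrusted_majority(
    oracles: List[BoolFn], certs: List[Certificate], x: int
) -> Union[int, str]:
    """Return 'FAIL' if some oracle violates its certificate, else the majority vote on x."""
    if len(oracles) != len(certs):
        raise ValueError("need exactly one certificate per oracle")
    if not all(consistent(f, c) for f, c in zip(oracles, certs)):
        return "FAIL"
    ones = sum(f(x) for f in oracles)
    # Exact majority here; the proof only needs Approximate Majority.
    return 1 if 2 * ones > len(oracles) else 0


# Toy usage on the domain {0, 1, 2, 3}: the target function is the parity of x.
target = lambda x: x & 1
certs = [{x: target(x) for x in range(4)} for _ in range(3)]
honest = [target, target, target]
cheating = [target, lambda x: 0, target]   # second oracle violates its certificate
```

With honest oracles the procedure returns the target value on every input, and with the cheating oracle it returns 'FAIL', which mirrors condition (ii) of Definition 26.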

If $\mathcal{C}$ is a semantic complexity class, such as BPP or UP, the difficulty is that there might be a
$\mathcal{C}$/poly machine $M$ and advice string $w$ for which the function $f_w$ is undefined (since $M$ need not
decide a language for every $w$). However, if we force the Untrusted-$\mathcal{C}$ oracle to restrict itself to
$w$ for which $f_w$ is defined, then Theorem 27 goes through for semantic classes as well. Using the
real-valued majority-certificates lemma that we develop in Section 3, it is possible to remove the
assumption that $f_w$ is defined for all $w$ for semantic classes such as BPP.

9 Appendix: Isolatability and Learnability 

The following definition abstracts a key notion from the majority-certificates lemma. 

Definition 28 (Majority-Isolatability) A Boolean concept class $S$ is majority-isolatable if for
every $f \in S$, there exist $m = \mathrm{poly}(n)$ certificates $C_1,\ldots,C_m$, each of size $\mathrm{poly}(n)$, such that

(i) $S[C_i]$ is nonempty for all $i \in [m]$, and

(ii) if $f_i \in S[C_i]$ for all $i \in [m]$, then $\mathrm{MAJ}(f_1,\ldots,f_m) = f$, where MAJ denotes pointwise
majority.

We now show that the majority-isolatability of a Boolean concept class $S$ is equivalent to
a large number of other properties of $S$, including having singly-exponential cardinality, having
polynomial VC-dimension, being PAC-learnable using $\mathrm{poly}(n)$ samples, and being 'winnowable.'
While we do not need this equivalence theorem elsewhere in the paper, it might be of interest
anyway. Note that the equivalence theorem is easily seen to break down for concept classes with
infinite input domains.

Definition 29 (VC-dimension) We say a Boolean concept class $S$ shatters the set $A \subseteq \{0,1\}^n$
if for all $2^{|A|}$ functions $g : A \to \{0,1\}$, there exists an $f \in S$ whose restriction to $A$ equals $g$. Then
the VC-dimension of $S$, or $\mathrm{VCdim}(S)$, is the size of the largest set shattered by $S$.
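
For very small classes, the definition can be checked by brute force. The following Python sketch assumes each function is given as a dictionary from domain points to bits (an encoding chosen here for illustration) and simply tests every subset of the domain; it is exponential in the domain size and meant only to make the definition concrete.

```python
from itertools import combinations
from typing import Dict, List, Sequence

BoolTable = Dict[int, int]   # assumed encoding: domain point -> output bit


def shatters(S: List[BoolTable], A: Sequence[int]) -> bool:
    """True if every one of the 2^|A| labelings of A is realized by some f in S."""
    patterns = {tuple(f[x] for x in A) for f in S}
    return len(patterns) == 2 ** len(A)


def vc_dimension(S: List[BoolTable], domain: Sequence[int]) -> int:
    """Size of the largest subset of the domain shattered by S (brute force)."""
    best = 0
    for k in range(1, len(domain) + 1):
        if any(shatters(S, A) for A in combinations(domain, k)):
            best = k
        else:
            # Subsets of shattered sets are shattered, so no larger set can work.
            break
    return best


# Toy usage: threshold functions f_t(x) = [x >= t] on the domain {0, 1, 2, 3}.
domain = [0, 1, 2, 3]
thresholds = [{x: int(x >= t) for x in domain} for t in range(5)]
```

The threshold class has five functions and shatters single points but no pair, so its VC-dimension is 1, consistent with the bound $\mathrm{VCdim}(S) \le \log_2 |S|$ used in the equivalence theorem.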

Given a distribution $\mathcal{D}$ over $\{0,1\}^n$, we say the Boolean functions $f,g : \{0,1\}^n \to \{0,1\}$ are
$(\mathcal{D},\varepsilon)$-close if

$$\Pr_{x \sim \mathcal{D}}[g(x) = f(x)] \ge 1 - \varepsilon.$$

Definition 30 (Learnability) $S$ is learnable if for all $f \in S$, distributions $\mathcal{D}$, and $\varepsilon,\delta > 0$,
there exists an $m = \mathrm{poly}(n, 1/\varepsilon, \log 1/\delta)$ such that with probability at least $1 - \delta$ over sample points
$x_1,\ldots,x_m$ drawn independently from $\mathcal{D}$, every $g \in S$ satisfying $g(x_1) = f(x_1), \ldots, g(x_m) = f(x_m)$
is $(\mathcal{D},\varepsilon)$-close to $f$.

We can also define 'approximability,' which is like learnability except that the choice of training 
examples can be nondeterministic: 

Definition 31 (Approximability) $S$ is approximable if for all $f \in S$, distributions $\mathcal{D}$, and $\varepsilon > 0$, there
exists a certificate $C$ of size $\mathrm{poly}(n, 1/\varepsilon)$ such that every $g \in S[C]$ is $(\mathcal{D},\varepsilon)$-close to $f$.

Finally, let us call attention to a notion that implicitly played a major role in the proof of
Lemma 3.

Definition 32 (Winnowability) $S$ is winnowable if for all nonempty subsets $S' \subseteq S$, there exists
a certificate $C$ of size $\mathrm{poly}(n)$ such that $|S'[C]| = 1$.

We can now prove the equivalence theorem. 

Theorem 33 Let $S$ be a Boolean concept class. Then $|S| \le 2^{\mathrm{poly}(n)}$ iff $\mathrm{VCdim}(S) \le \mathrm{poly}(n)$ iff
$S$ is learnable iff $S$ is approximable iff $S$ is majority-isolatable iff $S$ is winnowable.

Proof. $|S| \le 2^{\mathrm{poly}(n)} \implies \mathrm{VCdim}(S) \le \mathrm{poly}(n)$ follows from the trivial upper bound
$\mathrm{VCdim}(S) \le \log_2 |S|$.

$\mathrm{VCdim}(S) \le \mathrm{poly}(n) \implies |S| \le 2^{\mathrm{poly}(n)}$ is Sauer's Lemma [26], which implies the relation
$|S| \le 2^{n \cdot \mathrm{VCdim}(S)}$.

$|S| \le 2^{\mathrm{poly}(n)} \implies$ Learnable was proved by Valiant [28].

Learnable $\implies$ Approximable is immediate, and Approximable $\implies \mathrm{VCdim}(S) \le \mathrm{poly}(n)$
follows from a counting argument (see Blumer et al. [11] for details).

$|S| \le 2^{\mathrm{poly}(n)} \implies$ Majority-Isolatable was the content of Lemma 3.

Majority-Isolatable $\implies |S| \le 2^{\mathrm{poly}(n)}$ follows from another counting argument: if $S$ is
majority-isolatable, then every $f \in S$ is uniquely determined by $\mathrm{poly}(n)$ certificates $C_1,\ldots,C_m$,
each of which can be specified using $\mathrm{poly}(n)$ bits.

For $|S| \le 2^{\mathrm{poly}(n)} \implies$ Winnowable, let $S' \subseteq S$. Then as in the proof of Lemma 3, we can use
binary search to winnow $S'$ down to a single function $f \in S'$, which yields a certificate of size at
most $\log_2 |S'| \le \log_2 |S|$.
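
The binary-search winnowing step can be sketched in a few lines of Python. The sketch assumes functions are given as truth-table tuples over a small explicit domain and that a certificate is a dictionary of input/output constraints; both encodings, and the name `winnow`, are illustrative choices rather than anything fixed by the text. At each step it queries an input on which the surviving functions disagree and keeps the minority side, so the number of survivors at least halves.

```python
from typing import Dict, Iterable, Sequence, Tuple

TruthTable = Tuple[int, ...]   # assumed encoding: f[x] is the output bit on input x


def winnow(
    S_prime: Iterable[TruthTable], domain: Sequence[int]
) -> Tuple[TruthTable, Dict[int, int]]:
    """Return (f, cert) such that f is the unique function in S_prime consistent with cert."""
    remaining = sorted(set(S_prime))
    if not remaining:
        raise ValueError("S_prime must be nonempty")
    cert: Dict[int, int] = {}
    while len(remaining) > 1:
        # Distinct functions must disagree somewhere, so such an x exists.
        x = next(x for x in domain if len({f[x] for f in remaining}) > 1)
        zeros = [f for f in remaining if f[x] == 0]
        ones = [f for f in remaining if f[x] == 1]
        # Keep the smaller (nonempty) side: at most half of the functions survive.
        if len(zeros) <= len(ones):
            remaining, cert[x] = zeros, 0
        else:
            remaining, cert[x] = ones, 1
    return remaining[0], cert


# Toy usage: the five threshold functions on the domain {0, 1, 2, 3}.
S = [tuple(int(x >= t) for x in range(4)) for t in range(5)]
f, cert = winnow(S, range(4))
```

Because every query at least halves the surviving set, the certificate has at most $\log_2 |S'|$ constraints, matching the bound in the proof.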

For Winnowable $\implies |S| \le 2^{\mathrm{poly}(n)}$, we prove the contrapositive. Suppose $|S| \ge 2^{t(n)}$ for some
superpolynomial function $t(n)$ (at least, for infinitely many $n$). Then define a subset $S' \subseteq S$ by
the following iterative procedure. Initially $S' = S$. Then so long as there exists a certificate $C$ of
size at most $t(n)/(2n+2)$ such that $|S'[C]| = 1$, remove the function $f \in S'[C]$ from $S'$, halting
only when no more such 'isolating certificates' can be found.

The number of certificates of size $k$ is at most $2^{(n+1)k}$, and a given certificate $C$ can only be
chosen once, since thereafter $S'[C]$ is empty. So when the above procedure halts, we are left with
a set $S'$ such that $|S'| \ge 2^{t(n)} - 2^{(n+1)t(n)/(2n+2)} > 0$. Furthermore, for every function $f$ remaining
in $S'$, there can be no polynomial-size certificate $C$ such that $S'[C] = \{f\}$; for if there were, then
we would already have eliminated $f$ in the process of forming $S'$. Hence $S$ is not winnowable. ■



10 Appendix: Winnowing of p-Concept Classes 

In this appendix, we look more closely at the problem solved by Lemma 10 (the 'Safe Winnowing
Lemma'), and ask in what senses it is possible to winnow a p-concept class down to 'essentially'
just one function. The answer turns out to be interesting, even though we do not need it for our
quantum complexity applications.

We first give a definition that abstracts part of what Lemma 10 was trying to accomplish.

Definition 34 (Winnowability) A p-concept class $S$ is $L_1$-winnowable if the following holds.
For all nonempty subsets $S' \subseteq S$ and $\varepsilon > 0$, there exists a function $f \in S'$, a set $X \subseteq \{0,1\}^n$ of
size $\mathrm{poly}(n, 1/\varepsilon)$, and a $\delta = \mathrm{poly}(\varepsilon)$ such that every $g \in S'$ that satisfies $\Delta_1(f,g)[X] \le \delta$ also
satisfies $\Delta_\infty(f,g) \le \varepsilon$. Likewise, $S$ is $L_2$-winnowable if $\Delta_2(f,g)[X] \le \delta$ implies $\Delta_\infty(f,g) \le \varepsilon$,
and $L_\infty$-winnowable if $\Delta_\infty(f,g)[X] \le \delta$ implies $\Delta_\infty(f,g) \le \varepsilon$.

Clearly $L_\infty$-winnowability implies $L_2$-winnowability implies $L_1$-winnowability. The following
lemma will imply that every set of functions with a small cover is $L_1$-winnowable.

Lemma 35 ($L_1$-Winnowing Lemma) Let $S$ be a set of functions $f : \{0,1\}^n \to [0,1]$. For some
parameter $\varepsilon > 0$, let $C$ be a finite $\varepsilon$-cover for $S$. Then there exists an $f \in S$, as well as a subset
$X \subseteq \{0,1\}^n$ of size $O\!\left(\frac{1}{\varepsilon}\log|C|\right)$, such that every $g \in S$ that satisfies $\Delta_1(f,g)[X] \le 0.4\varepsilon$ also
satisfies $\Delta_\infty(f,g) \le 2\varepsilon$.

Proof. We will consider functions $P : S \to [0,1]$, which we think of as assigning a probability
weight $P(g)$ to each function $g \in S$. In particular, given an $f \in S$ and a subset of inputs
$X \subseteq \{0,1\}^n$, define

$$P_{f,X}(g) := \exp\left(-\Delta_1(f,g)[X]\right).$$

Clearly $P_{f,X}(f) = 1$. Our goal will be to find $f \in S$ and $X \subseteq \{0,1\}^n$, with $|X| = O\!\left(\frac{1}{\varepsilon}\log|C|\right)$,
such that every $g \in S$ that satisfies $P_{f,X}(g) \ge e^{-0.4\varepsilon}$ also satisfies $\Delta_\infty(f,g) \le 2\varepsilon$. Supposing we
have found such an $(f,X)$ pair, the lemma is proved.

Consider the progress measure

$$M_{f,X} := \sum_{h \in C} P_{f,X}(h).$$

Clearly $M_{f,X} \le |C|$ for all $(f,X)$. We claim, furthermore, that $M_{f,X} \ge \exp(-\varepsilon|X|)$ for all $(f,X)$.
For since $C$ is an $\varepsilon$-cover for $S$, there always exists an $h \in C$ such that $\Delta_1(f,h)[X] \le \varepsilon|X|$, and
that $h$ alone contributes at least $\exp(-\varepsilon|X|)$ to $M_{f,X}$.

We will construct $(f,X)$ by an iterative process. Initially $f$ is arbitrary and $X$ is the empty set,
so $P_{f,X}(g) = 1$ for all $g$, and $M_{f,X} = |C|$. Now, suppose there exists a $g \in S$ such that
$P_{f,X}(g) \ge e^{-0.4\varepsilon}$, as well as an input $y$ such that $|f(y) - g(y)| > 2\varepsilon$. As a first step, let $Y := X \cup \{y\}$ (that
is, add $y$ into our set of inputs). Then the crucial claim is that either $M_{f,Y}$ or $M_{g,Y}$ is a $1 - \Omega(\varepsilon)$
factor smaller than $M_{f,X}$. This means in particular that, by replacing $X$ with $Y$ (increasing $|X|$
by 1), and possibly also replacing $f$ with $g$, we can decrease $M_{f,X}$ by a $1 - \Omega(\varepsilon)$ factor compared
to its previous value. Since $\exp(-\varepsilon|X|) \le M_{f,X} \le |C|$, it is clear that $M_{f,X}$ can decrease in this
way at most

$$O\!\left(\frac{1}{\varepsilon}\log\frac{|C|}{\exp(-\varepsilon|X|)}\right)$$

times. Setting the above expression equal to $|X|$ and solving, we find that the process must
terminate when $|X| = O\!\left(\frac{1}{\varepsilon}\log|C|\right)$, returning an $(f,X)$ pair with the properties we want.
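We now prove the crucial claim. The first step is to show that either

$$M_{f,Y} = \sum_{h \in C} P_{f,X}(h)\, e^{-|f(y)-h(y)|}$$

or else

$$M' := \sum_{h \in C} P_{f,X}(h)\, e^{-|g(y)-h(y)|}$$

is at most $\frac{1+e^{-\varepsilon}}{2} M_{f,X}$. For since $|f(y) - g(y)| > 2\varepsilon$, either $|f(y) - h(y)| > \varepsilon$ or $|g(y) - h(y)| > \varepsilon$ by the
triangle inequality. So for every $h \in C$, either $e^{-|f(y)-h(y)|} < e^{-\varepsilon}$ or $e^{-|g(y)-h(y)|} < e^{-\varepsilon}$. This in turn means that either
$M_{f,Y}$ or $M'$ must have at least half its terms (as weighted by the $P_{f,X}(h)$'s) shrunk by an $e^{-\varepsilon}$
factor.

If $M_{f,Y} \le \frac{1+e^{-\varepsilon}}{2} M_{f,X}$ then we are done. So suppose instead that $M' \le \frac{1+e^{-\varepsilon}}{2} M_{f,X}$. Then,
using the triangle inequality $\Delta_1(g,h)[X] \ge \Delta_1(f,h)[X] - \Delta_1(f,g)[X]$,

$$\begin{aligned}
M_{g,Y} &= \sum_{h \in C} \exp\left(-\Delta_1(g,h)[X]\right) e^{-|g(y)-h(y)|} \\
&\le M' \exp\left(\Delta_1(f,g)[X]\right) \\
&= \frac{M'}{P_{f,X}(g)} \\
&\le \frac{M'}{e^{-0.4\varepsilon}} \\
&\le \left(1 - \Omega(\varepsilon)\right) M_{f,X},
\end{aligned}$$

and we are done.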
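
To make the iterative process concrete, here is a minimal Python sketch on a finite class over a small explicit domain. Several things are assumptions for illustration: functions are tuples of floats, $\Delta_1(f,g)[X]$ is taken to be the sum of $|f(x)-g(x)|$ over $x \in X$, and the names `l1_winnow` and `progress` are hypothetical. The sketch follows the update rule of the proof (add a distinguishing input, then keep whichever of $f$ and $g$ has the smaller progress measure), but it makes no claim about matching the $O(\frac{1}{\varepsilon}\log|C|)$ bound; it only guarantees the stopping condition.

```python
import math
from typing import List, Sequence, Tuple

Fn = Tuple[float, ...]   # assumed encoding: f[x] in [0, 1] for x in range(len(f))


def delta1(f: Fn, g: Fn, X: Sequence[int]) -> float:
    """L1 distance between f and g restricted to the inputs in X."""
    return sum(abs(f[x] - g[x]) for x in X)


def delta_inf(f: Fn, g: Fn) -> float:
    """Maximum pointwise distance between f and g over the whole domain."""
    return max(abs(a - b) for a, b in zip(f, g))


def progress(f: Fn, X: Sequence[int], cover: Sequence[Fn]) -> float:
    """Progress measure M_{f,X}: sum over the cover of exp(-Delta_1(f, h)[X])."""
    return sum(math.exp(-delta1(f, h, X)) for h in cover)


def l1_winnow(S: Sequence[Fn], cover: Sequence[Fn], eps: float) -> Tuple[Fn, List[int]]:
    """Iterate until every g that is 0.4*eps-close to f on X is 2*eps-close everywhere."""
    f, X = S[0], []
    while True:
        # Look for a g close to f on X together with an input y where they differ a lot.
        bad = next(
            ((g, y)
             for g in S if delta1(f, g, X) <= 0.4 * eps
             for y in range(len(f)) if abs(f[y] - g[y]) > 2 * eps),
            None,
        )
        if bad is None:
            return f, X
        g, y = bad
        X = X + [y]                      # y is new: an old y would make g far from f on X
        if progress(g, X, cover) < progress(f, X, cover):
            f = g                        # switch to whichever function has smaller M


# Toy usage: four functions on a 4-point domain, with S serving as its own cover.
S = [(0.0, 0.0, 0.0, 0.0), (0.0, 0.0, 0.0, 1.0), (0.5, 0.5, 0.5, 0.5), (1.0, 0.0, 0.0, 0.0)]
eps = 0.1
f, X = l1_winnow(S, S, eps)
```

The loop always terminates on a finite domain, since each iteration adds an input not already in $X$, and once $X$ is the whole domain no distinguishing pair can exist.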



Recall that $S$ is coverable if for all $\varepsilon > 0$, there exists an $\varepsilon$-cover for $S$ of size $2^{\mathrm{poly}(n,1/\varepsilon)}$. We
can now prove the following equivalence theorem.

Theorem 36 A p-concept class $S$ is coverable if and only if it is $L_1$-winnowable.

Proof. For Coverable $\implies$ $L_1$-Winnowable: fix a subset $S' \subseteq S$ and an $\varepsilon > 0$. Let $C$ be
an $\varepsilon/2$-cover for $S'$ of size $2^{\mathrm{poly}(n,1/\varepsilon)}$. Then by Lemma 35, there exists an $f \in S'$, as well as
a subset $X \subseteq \{0,1\}^n$ of size $O\!\left(\frac{1}{\varepsilon}\log|C|\right) = \mathrm{poly}(n,1/\varepsilon)$, such that every $g \in S'$ that satisfies
$\Delta_1(f,g)[X] \le \varepsilon/5$ also satisfies $\Delta_\infty(f,g) \le \varepsilon$.

For $L_1$-Winnowable $\implies$ Coverable, we prove the contrapositive. Suppose there exists a
function $t(n,1/\varepsilon)$, superpolynomial in either $n$ or $1/\varepsilon$, such that $S$ has no $\varepsilon$-cover of size $2^{t(n,1/\varepsilon)}$
(at least, for infinitely many $n$ or $1/\varepsilon$). Let $p = \mathrm{poly}(n,1/\varepsilon)$ and $\delta = \mathrm{poly}(\varepsilon)$. Given a function $f$
and subset $X \subseteq \{0,1\}^n$, let $L[f,X]$ be the set of all functions $g$ such that $\Delta_1(f,g)[X] \le \delta$. Then
our goal is to construct a subset $S' \subseteq S$ for which there is no pair $(f,X)$ such that

• $f \in S'$,

• $X \subseteq \{0,1\}^n$ is a set of inputs with $|X| = p$, and

• $g \in S' \cap L[f,X]$ implies $\Delta_\infty(f,g) \le \varepsilon$.

Let $W := \lceil 2p/\delta \rceil$. Also, call a set $B$ of functions $f : \{0,1\}^n \to [0,1]$ a sliver if there exists a
set $X \subseteq \{0,1\}^n$ with $|X| = p$, as well as a function $a : X \to [W]$, such that

$$f \in B \iff f(x) \in \left[\frac{a(x)-1}{W}, \frac{a(x)}{W}\right] \quad \forall x \in X.$$



Then define a subset $S' \subseteq S$ by the following iterative procedure. Initially $S' = S$. Then so long
as there exists a sliver $B$ such that $S' \cap B$ is nonempty, together with a function $f_B \in S$ such that

$$g \in S' \cap B \implies \Delta_\infty(f_B, g) \le \varepsilon,$$

remove $B$ from $S'$ (that is, set $S' := S' \setminus B$). Halt only when no more such slivers $B$ can be found.

As a first observation, the total number of slivers is at most $(2^n W)^p = 2^{\mathrm{poly}(n,1/\varepsilon)}$. Thus, the
above procedure must halt after at most $2^{\mathrm{poly}(n,1/\varepsilon)}$ iterations.

As a consequence, we claim that $S'$ must be nonempty after the procedure has halted. For
suppose not. Then the sequence of functions $f_B$ chosen by the procedure would form an $\varepsilon$-cover
for $S$ of size $2^{\mathrm{poly}(n,1/\varepsilon)}$, since for all $g \in S$, we would simply need to find a sliver $B$ containing $g$
that was removed by the procedure; then $f_B$ would satisfy $\Delta_\infty(f_B, g) \le \varepsilon$. But this contradicts
the assumption that no such $\varepsilon$-cover exists.

Finally, we claim that once the procedure halts, there can be no $f \in S'$ and set $X$ of $p$ inputs
such that $\Delta_\infty(f,g) \le \varepsilon$ for all $g \in S' \cap L[f,X]$. For suppose to the contrary that such an $(f,X)$
pair existed. It is not hard to see that for every $(f,X)$, there exists a sliver $B$ that contains $f$
and is contained in $L[f,X]$. But then $S' \cap B$ would be nonempty, and $(B,f)$ would satisfy the
condition $g \in S' \cap B \implies \Delta_\infty(f,g) \le \varepsilon$. So $B$ (or some other sliver containing $f$) would already
have been eliminated in the process of forming $S'$. ■

A natural question is whether Lemma 35 and Theorem 36 would also hold with $L_2$-winnowability
or $L_\infty$-winnowability in place of $L_1$-winnowability. The next theorem shows, somewhat surprisingly,
that the use of the $L_1$ norm was essential.

Theorem 37 There exists a p-concept class $S$ that is coverable, but not $L_2$-winnowable or
$L_\infty$-winnowable.

Proof. We prove a stronger statement: there exists a finite p-concept class $S$, of size $|S| \le 2^{\mathrm{poly}(n)}$,
that is not $L_2$-winnowable (and as a direct consequence, not $L_\infty$-winnowable either). To prove
this, it suffices to find a set $S$ with $|S| \le 2^{\mathrm{poly}(n)}$, as well as a constant $\varepsilon > 0$, for which the following
holds. For all $f \in S$, subsets $X \subseteq \{0,1\}^n$ of size less than $2^n - n^2$, and constants $\delta$ depending on
$\varepsilon$, there exists a $g \in S$ such that $\Delta_2(f,g)[X] \le \delta$ but $\Delta_\infty(f,g) > \varepsilon$ (at least, for all sufficiently
large $n$).

Let $\varepsilon$ be any constant in $(0,1)$, and let $S$ be the class of all functions $f : \{0,1\}^n \to [0,1]$ of the
form

$$f(x) = \frac{a_x}{n},$$

where the $a_x$'s are nonnegative integers satisfying

$$\sum_{x \in \{0,1\}^n} a_x = n^2.$$

Then clearly $|S| \le (2^n)^{n^2}$, since we can form any $f \in S$ by starting from the identically-0 function,
then choosing $n^2$ inputs $x$ (with repetition) on which to increment $f$ by $1/n$.

Now let $f \in S$, and let $X \subseteq \{0,1\}^n$ have size $|X| < 2^n - n^2$. Then we can 'corrupt' $f$ to create
a new function $g \in S$ as follows. Let $Z$ be a set of $n$ inputs $x \in \{0,1\}^n$ on which $f(x) > 0$ (note
that such a $Z$ must exist, since $\sum_x f(x) = n$ and $f(x) \le 1$ for all $x$). Also, by the pigeonhole principle,
there exists a $y \in \{0,1\}^n \setminus X$ such that $f(y) = 0$. Fix that $y$, and define

$$g(x) := \begin{cases} 1 & \text{if } x = y \\ f(x) - 1/n & \text{if } x \in Z \\ f(x) & \text{otherwise.} \end{cases}$$

Clearly $g \in S$ and

$$\Delta_2(f,g)[X] \le \sqrt{n \cdot \frac{1}{n^2}} = \frac{1}{\sqrt{n}},$$

since on $X$ the functions $f$ and $g$ differ only on the at most $n$ inputs of $Z$, each time by $1/n$.
On the other hand, we have $f(y) = 0$ and $g(y) = 1$, so $\Delta_\infty(f,g) = 1$. Therefore $S$ is not
$L_2$-winnowable. ■
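
The corruption step can be checked numerically on a small instance. The Python sketch below assumes inputs are encoded as integers in `range(2**n)` and uses exact fractions; the function name `corrupt` and the specific choices of $f$ and $X$ are illustrative only. It works with the squared $L_2$ distance on $X$, assuming $\Delta_2(f,g)[X]$ denotes the usual Euclidean distance restricted to $X$.

```python
from fractions import Fraction
from typing import List, Set, Tuple


def corrupt(f: List[Fraction], X: Set[int], n: int) -> Tuple[List[Fraction], int]:
    """Build g from f as in the proof: set g(y) = 1 at a zero y outside X, lower n positive inputs by 1/n."""
    zeros_outside_X = [x for x in range(len(f)) if x not in X and f[x] == 0]
    positive = [x for x in range(len(f)) if f[x] > 0]
    if not zeros_outside_X or len(positive) < n:
        raise ValueError("f or X does not satisfy the hypotheses of the construction")
    y, Z = zeros_outside_X[0], positive[:n]
    g = list(f)
    g[y] = Fraction(1)
    for x in Z:
        g[x] -= Fraction(1, n)
    return g, y


def delta2_squared(f: List[Fraction], g: List[Fraction], X: Set[int]) -> Fraction:
    """Squared L2 distance between f and g restricted to X (kept squared to stay exact)."""
    return sum(((f[x] - g[x]) ** 2 for x in X), Fraction(0))


def delta_inf(f: List[Fraction], g: List[Fraction]) -> Fraction:
    """Maximum pointwise distance over the whole domain."""
    return max(abs(a - b) for a, b in zip(f, g))


# Toy instance with n = 5: the domain has 32 points and f has 25 inputs of value 1/5.
n = 5
f = [Fraction(1, n)] * (n * n) + [Fraction(0)] * (2 ** n - n * n)
X = set(range(6))                 # |X| = 6 < 2^n - n^2 = 7
g, y = corrupt(f, X, n)
```

In this instance $g$ still sums to $n$, the squared $L_2$ distance on $X$ is $1/n$, and the $L_\infty$ distance is 1, as the proof predicts.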